WEBVTT
Kind: captions
Language: en-US
00:00:00.130 --> 00:00:02.780
… everyone for coming to this
week’s Earthquake Science seminar.
00:00:02.780 --> 00:00:06.140
I have a few announcements
before we get started.
00:00:06.140 --> 00:00:12.340
Next week, we will officially be
moving this seminar to Moffett Field.
00:00:12.340 --> 00:00:18.560
So Austin Elliott’s seminar
will be in Building 19.
00:00:18.560 --> 00:00:21.499
The conference room is
still to be determined.
00:00:21.499 --> 00:00:25.390
So we will have probably
an announcement by email
00:00:25.390 --> 00:00:29.089
about where that
location will be.
00:00:29.089 --> 00:00:32.960
As you probably know, tomorrow is
the 30th anniversary of the Loma Prieta
00:00:32.960 --> 00:00:36.600
earthquake, and there are plenty
of events going on related to that,
00:00:36.600 --> 00:00:41.280
including a governor’s press conference
at 11:00 a.m. tomorrow and a few
00:00:41.280 --> 00:00:45.180
nighttime events in San Francisco
at the Exploratorium and the
00:00:45.180 --> 00:00:52.170
California Academy of Sciences,
featuring USGS speakers.
00:00:52.170 --> 00:00:54.520
So come out if you can.
00:00:56.340 --> 00:00:59.200
And, with that, it’s my
pleasure to introduce
00:00:59.200 --> 00:01:02.180
this week’s speaker,
Nico Kuehn.
00:01:02.190 --> 00:01:06.960
He got his Ph.D. at the University
of Potsdam and was there for
00:01:06.960 --> 00:01:13.700
a few years as a research scientist.
He then came to the Pacific Earthquake
00:01:13.700 --> 00:01:18.450
Engineering Research Center – PEER –
at Berkeley as a postdoc and is now
00:01:18.450 --> 00:01:25.080
at the UCLA Garrick Risk Institute
as a research scientist.
00:01:25.080 --> 00:01:33.520
And today he’ll be talking about the
use of nonergodic GMPEs in PSHA.
00:01:35.260 --> 00:01:37.899
- Thank you, Grace.
Can everybody hear me okay?
00:01:37.899 --> 00:01:41.479
All right. Perfect.
And thank you for having me.
00:01:41.479 --> 00:01:45.859
So I’ll talk about – mainly I’ll talk
about nonergodic ground motion
00:01:45.859 --> 00:01:49.149
prediction equations, or I probably
should have called that
00:01:49.149 --> 00:01:53.600
ground motion models,
but pretty much the same thing.
00:01:53.600 --> 00:01:59.579
And then a little bit about the use in
PSHA and how that’s going to change.
00:02:00.220 --> 00:02:07.840
Oh, this one. So the reason why we’re
doing this is summarized in this plot.
00:02:07.840 --> 00:02:09.920
I should probably use this.
00:02:09.920 --> 00:02:12.250
So this is a plot from
Norm Abrahamson’s paper
00:02:12.250 --> 00:02:18.450
from 2019 showing a comparison
of an ergodic seismic hazard analysis
00:02:18.450 --> 00:02:21.500
and a nonergodic
PSHA for two sites –
00:02:21.500 --> 00:02:25.720
one in San Jose and
one in northeastern California.
00:02:25.720 --> 00:02:30.390
And the ergodic hazard
curves are shown in blue,
00:02:30.390 --> 00:02:33.890
and the nonergodic hazard
curves are shown in red.
00:02:33.890 --> 00:02:36.330
And nonergodic is pretty much
the standard the way we’re
00:02:36.330 --> 00:02:41.600
doing it right now pretty routinely.
And the thing is, for nonergodic,
00:02:41.600 --> 00:02:47.140
things change quite a bit.
In San Jose, on the left, they stay –
00:02:47.150 --> 00:02:51.360
at least the mean hazard
stays pretty much the same.
00:02:51.360 --> 00:02:55.020
It’s very similar in
northeastern California.
00:02:55.020 --> 00:02:59.280
And so I’m going to talk about
how did we do – so, how did we
00:02:59.280 --> 00:03:03.090
come up with these hazard curves,
and why are these so different.
00:03:03.090 --> 00:03:05.260
You probably know all this,
and it’s not – it’s been known
00:03:05.260 --> 00:03:08.820
for a while that this is going to
change hazard quite a bit.
00:03:08.820 --> 00:03:14.290
The main thing to note is, on the left,
San Jose, Bay Area, there is some
00:03:14.290 --> 00:03:17.260
data around. That means we
change the hazard quite a bit.
00:03:17.260 --> 00:03:20.440
Northeastern California,
not really that much data.
00:03:20.440 --> 00:03:23.260
That means the mean
hazard stays the same.
00:03:23.260 --> 00:03:27.900
But we increase the range of
the hazard curve distribution.
00:03:27.900 --> 00:03:31.300
So we got much wider
uncertainty in the hazard curves.
00:03:31.300 --> 00:03:34.320
And that’s pretty much
all there is to say about this.
00:03:34.320 --> 00:03:37.220
And we’ll just go
through how got it.
00:03:37.220 --> 00:03:42.510
So basically, the way Norm did this is
he combined two methods – one from
00:03:42.510 --> 00:03:50.650
Landwehr et al. 2016, and one from the
one we did from 2019 for path effects.
00:03:50.650 --> 00:03:55.260
And I’ll just probably mainly talk
about the Landwehr et al. paper
00:03:55.260 --> 00:03:59.080
and the methodology about that
because I think that’s the way we are
00:03:59.080 --> 00:04:03.840
currently envisioning going
forward with nonergodic GMPEs.
00:04:03.900 --> 00:04:08.800
There are some pitfalls in doing that,
and I’ll also briefly talk about this.
00:04:08.800 --> 00:04:14.100
This is also – both Norm and
Niels Landwehr in Potsdam,
00:04:14.100 --> 00:04:16.440
so they’re really
instrumental in this.
00:04:16.440 --> 00:04:22.790
And I’m mainly showing this –
a lot of their work as well.
00:04:22.790 --> 00:04:28.050
So I assume you’re all
familiar with seismic hazard.
00:04:28.050 --> 00:04:33.090
If not, this is just a brief refresher.
We’re trying to estimate the
00:04:33.090 --> 00:04:36.030
rate of exceedance for some
ground motion parameter,
00:04:36.030 --> 00:04:40.810
like PGA or response
spectrum or something.
00:04:40.810 --> 00:04:44.290
The rate of exceedance for
a certain ground motion level A,
00:04:44.290 --> 00:04:47.840
and that’s just by integrating over
the ground motion distribution
00:04:47.840 --> 00:04:51.380
and integrating all possible
magnitudes and distances, and then
00:04:51.380 --> 00:04:55.300
summing that up over all possible
sources. That’s just basically it.
00:04:55.310 --> 00:04:58.490
And the key ingredient for that is
the ground motion distribution.
00:04:58.490 --> 00:05:01.660
So what we need to know is,
given an earthquake scenario,
00:05:01.660 --> 00:05:05.910
given a magnitude, the distance,
and maybe some other parameters,
00:05:05.910 --> 00:05:09.500
what kind of ground
motions can we expect?
00:05:09.500 --> 00:05:13.100
And that is typically modeled
as a log normal distribution,
00:05:13.100 --> 00:05:17.220
which has a certain median and a
certain standard deviation, sigma.
00:05:17.220 --> 00:05:21.840
And the median is a function of
magnitude, distance, and so on.
00:05:21.840 --> 00:05:27.510
Vs30 and event type and
depth are some other things.
00:05:27.510 --> 00:05:31.480
And often – and then the standard
deviation can be a function of those
00:05:31.480 --> 00:05:34.840
as well, but it’s also –
it’s the sum of multiple terms.
00:05:34.840 --> 00:05:39.990
And depending a bit on what we kind of
think is – this is noise, this is not noise
00:05:39.990 --> 00:05:44.590
that we can model, we’re kind of
moving some parts from here to there.
00:05:44.590 --> 00:05:48.660
And I’ll talk a little bit
how that’s going to go.
00:05:48.660 --> 00:05:54.460
In a simple example, just to clarify,
or to introduce – to illustrate what a
00:05:54.460 --> 00:06:00.720
nonergodic hazard is, imagine we have
20 sites in California, or just 20 sites,
00:06:00.720 --> 00:06:06.680
and each of them is only affected by a
magnitude 6 in 20 kilometers’ distance.
00:06:06.680 --> 00:06:11.710
That just makes simulation a lot easier.
So I simulated some simple example.
00:06:11.710 --> 00:06:16.830
And these events are super frequent,
so at each site, we have 100 recordings.
00:06:16.830 --> 00:06:20.380
So the way we would do this right
now is, we – well, the way it’s been –
00:06:20.380 --> 00:06:25.520
it’s done right now is to get
a ground motion model for this.
00:06:27.120 --> 00:06:30.091
So, if we want to estimate the
probability of exceedance, we need
00:06:30.091 --> 00:06:33.230
to have a ground motion model.
For this case, it’s pretty simple.
00:06:33.230 --> 00:06:38.580
We’ll just look at the data, calculate
the mean and the – or, the median
00:06:38.580 --> 00:06:42.680
and the standard deviation, and then
we can have some estimate of our
00:06:42.680 --> 00:06:44.139
ground motion distribution.
00:06:44.140 --> 00:06:48.720
So this is all the data that
we have – 2,000 data points.
00:06:48.720 --> 00:06:52.860
Log PGA. And that’s just the histogram
of the event types, so that’s a nice
00:06:52.860 --> 00:06:56.640
normal distribution, so that would
be log normal distribution PGA.
00:06:56.650 --> 00:07:01.300
So what we can do is we can just
go ahead and look at – oh, this is –
00:07:01.300 --> 00:07:04.370
this is the mean, and then we get some
estimate of the standard deviation,
00:07:04.370 --> 00:07:10.840
and then, if we want to know what’s
the probability that the PGA is larger
00:07:10.840 --> 00:07:13.931
than some certain value,
we just go around here,
00:07:13.940 --> 00:07:19.580
and look it up or calculate
the CDF or 1 minus the CDF.
00:07:19.580 --> 00:07:22.620
This is all pretty nice.
We can make some statistic tests.
00:07:22.620 --> 00:07:27.199
Oh, this looks – this is a probability
plot, so this is really a normal
00:07:27.199 --> 00:07:30.310
distribution, so we can do
all these kinds of things.
00:07:30.310 --> 00:07:33.699
And then we go ahead, and then we
can calculate the empirical mean and
00:07:33.699 --> 00:07:38.430
standard deviation and then just
calculate the probability of occurrence.
00:07:38.430 --> 00:07:44.120
So this is the exceedance probability
that Y – that PGA would be larger than
00:07:44.120 --> 00:07:48.220
some ground motion value,
and then this is the hazard curve.
00:07:48.220 --> 00:07:52.930
And then we can also look at the
standard error of the mean and say –
00:07:52.930 --> 00:07:57.020
and get some uncertainty around that
mean, put that on the hazard curve.
00:07:57.020 --> 00:08:00.710
You probably can’t really see that here,
but there are some – there are some
00:08:00.710 --> 00:08:02.740
dashed lines around that,
which would be, like,
00:08:02.740 --> 00:08:06.520
some uncertainty around
this mean hazard curve.
00:08:06.520 --> 00:08:09.020
The problem with this is that
this is actually not correct.
00:08:09.020 --> 00:08:13.900
Because, if we look at the sites
individually, we see clear
00:08:13.900 --> 00:08:16.790
differences in their distribution.
This is just one example.
00:08:16.790 --> 00:08:20.780
These are two sites. We know we
have a lot less recordings per site.
00:08:20.780 --> 00:08:24.890
But we also see that the means of
those two are really, really different.
00:08:24.890 --> 00:08:29.130
So that means, if we just aggregate
those two sites together, calculate an
00:08:29.130 --> 00:08:33.560
empirical mean, we’re going to make –
we’re going to be fine, like, on average,
00:08:33.560 --> 00:08:37.580
for those two sites. But individually,
we’re going to be wrong.
00:08:37.580 --> 00:08:43.560
And if we do this for all those 20 sites –
so we can actually just go ahead
00:08:43.560 --> 00:08:48.470
here and say this is the mean for Site 1.
This is the mean for Site 2.
00:08:48.470 --> 00:08:50.930
And then we go through
all those 20 sites.
00:08:50.930 --> 00:08:53.900
And then we get these
individual hazard curves.
00:08:53.900 --> 00:08:56.460
I also plotted the mean
of those hazard curves.
00:08:56.470 --> 00:09:01.190
So the mean of those hazard curves
is actually the same as the mean
00:09:01.190 --> 00:09:04.450
that is calculated when
we aggregate the data.
00:09:04.450 --> 00:09:08.500
What we see is, we see a really, really
big spread along those 20 sites.
00:09:08.500 --> 00:09:11.740
I mean, this is simulated data,
but this is also something
00:09:11.740 --> 00:09:14.970
that you will
see in practice.
00:09:14.970 --> 00:09:20.220
So that means, if we just aggregate all
those data from those 20 sites, calculate
00:09:20.220 --> 00:09:23.810
the hazard curve, we will be, on average,
correct over all those 20 sites.
00:09:23.810 --> 00:09:27.260
But, at each site,
we will be wrong.
00:09:27.260 --> 00:09:31.310
And then this is just – so basically
what that means is the probability
00:09:31.310 --> 00:09:34.510
of exceedance becomes
a lot less certain.
00:09:34.510 --> 00:09:39.810
So this is showing the mean ergodic and
the mean ergodic plus/minus the sigma
00:09:39.810 --> 00:09:46.180
calculated from the standard error
of the mean of the aggregated data.
00:09:46.180 --> 00:09:55.430
And those are the plus/minus 1 sigma
for – I call this nonergodic hazard here.
00:09:55.430 --> 00:09:59.580
So this would be the –
this would be the five and –
00:09:59.580 --> 00:10:03.760
or, plus/minus sigma of
those 20 hazard curves.
00:10:03.770 --> 00:10:08.010
So the thing why this becomes
important is, if we just now –
00:10:08.010 --> 00:10:12.810
we go to a new site, and I say,
this new site is also affected by
00:10:12.810 --> 00:10:17.800
just magnitude 6 and 20 kilometers,
but we don’t have any data.
00:10:17.800 --> 00:10:22.350
So if we want to calculate the hazard
for that – or, the probability of
00:10:22.350 --> 00:10:26.310
exceedance for that new site, we don’t
really know what it’s going to be.
00:10:26.310 --> 00:10:33.020
So we can just aggregate all those data,
and then we get the black line.
00:10:33.020 --> 00:10:38.280
But, if we just do that, we severely
underestimate the uncertainty.
00:10:38.290 --> 00:10:42.330
because, from what we see,
it’s probable that there will be
00:10:42.330 --> 00:10:47.430
some really strong site effect or really
strong effect at that particular site.
00:10:47.430 --> 00:10:49.310
So it could be
somewhere around here.
00:10:49.310 --> 00:10:53.110
It could be any of those 20 curves,
or it could be somewhere in between.
00:10:53.110 --> 00:10:56.610
So that means we should –
even if we don’t know anything
00:10:56.610 --> 00:11:00.220
about that particular site,
we need to take into account that
00:11:00.220 --> 00:11:05.100
there will probably be strong
effects that we don’t model.
00:11:05.100 --> 00:11:11.080
Otherwise, if we then collect data
at that new site, and it – then the
00:11:11.080 --> 00:11:14.010
hazard curve will shift. So it will –
it will be very different from
00:11:14.010 --> 00:11:17.170
the black line. And then people will
complain, and people complain
00:11:17.170 --> 00:11:20.890
all the time – why does the
hazard change? What happened?
00:11:20.890 --> 00:11:24.720
Well, the hazard changed, but we
were really, really uncertain before.
00:11:24.720 --> 00:11:28.649
So now we’ve got more data,
but now we just move it a little bit
00:11:28.649 --> 00:11:32.310
around in this uncertainty range.
And that’s just really, really important
00:11:32.310 --> 00:11:36.170
to know that we shouldn’t
just look at the mean hazard.
00:11:36.170 --> 00:11:40.060
We need to capture the – well,
we try to capture – the uncertainties
00:11:40.060 --> 00:11:44.740
are captured in this if we just move it
into the standard deviation.
00:11:44.740 --> 00:11:52.519
But that’s just – that won’t make it –
make it less easy to then go, once we
00:11:52.520 --> 00:11:58.600
collect data, do something else and try
to incorporate the site-specific hazards.
00:11:58.600 --> 00:12:03.060
So basically, what happens is, if we just
aggregate those data from the 20 sites,
00:12:03.060 --> 00:12:06.740
we get larger variability in
our ground motion distribution.
00:12:06.740 --> 00:12:08.860
But we get really,
really small uncertainty.
00:12:08.860 --> 00:12:13.480
We’ve got 2,000 records,
and it’s easy to just get
00:12:13.480 --> 00:12:17.540
a good estimate of the
mean from 2,000 records.
00:12:17.540 --> 00:12:22.760
It does – it ignores the differences
between the distributions at each site.
00:12:22.780 --> 00:12:28.620
On average, over all 20 sites, we’ll be
correct, but at each site, we’ll be wrong.
00:12:30.640 --> 00:12:34.410
The individual hazard curves and the
individual site can be very different.
00:12:34.410 --> 00:12:39.740
And they will be very different.
And it is important to recognize that
00:12:39.740 --> 00:12:46.570
and also to communicate that to people
that the hazard curves – because we –
00:12:46.570 --> 00:12:50.269
even if we don’t have any data,
they will be very, very different
00:12:50.269 --> 00:12:55.360
from what – from the mean hazard.
It’s very, very uncertain.
00:12:55.360 --> 00:13:01.470
All this coming together is that,
if we just aggregate all the data together,
00:13:01.470 --> 00:13:04.720
that means we’re kind of making the
ergodic assumption, which means
00:13:04.720 --> 00:13:09.910
that the – that we think the ground
motion distribution of all sites is
00:13:09.910 --> 00:13:15.660
the same as the ground motion
distribution at one site over space.
00:13:15.660 --> 00:13:17.040
And that’s wrong.
00:13:17.040 --> 00:13:21.900
And people have been – have known
that for a long time that that’s wrong.
00:13:21.900 --> 00:13:26.720
But it’s very easy, and it’s very
convenient, to make these assumptions
00:13:26.720 --> 00:13:30.700
because it makes the statistics very easy,
and you can – you can aggregate data
00:13:30.700 --> 00:13:34.649
from many different sources.
And you don’t have to go to
00:13:34.649 --> 00:13:38.100
some complicated model
to take that into account.
00:13:38.100 --> 00:13:42.670
Because it was very easy to do that in
the simulated example because it’s just
00:13:42.670 --> 00:13:46.959
one magnitude and distance scenario.
But once you aggregate data from
00:13:46.959 --> 00:13:49.839
more sources and over
many different magnitudes,
00:13:49.839 --> 00:13:53.230
things just become
quite messy.
00:13:53.230 --> 00:14:00.269
So just a little bit of math –
math is nice – to see how that works.
00:14:00.269 --> 00:14:05.530
So, in this – in the ergodic example,
by aggregating, we calculated the mean
00:14:05.530 --> 00:14:09.820
and standard deviation from all the –
from the aggregate data.
00:14:09.820 --> 00:14:15.850
For the nonergodic exceedance
probability at one site, we just take
00:14:15.850 --> 00:14:21.100
the site at that – the median at that
particular site and the standard deviation
00:14:21.100 --> 00:14:26.320
of the distribution at that particular site.
And then we can – we can do this.
00:14:26.320 --> 00:14:32.580
We can usually write this – instead of
saying, oh, we’ll just take the median
00:14:32.580 --> 00:14:37.670
at that particular site, we can also
just take the average of all sites
00:14:37.670 --> 00:14:42.000
and then just have an adjustment.
So it’s either – at that site, it’s either
00:14:42.000 --> 00:14:46.360
larger than the – than the average
or it’s smaller than the average.
00:14:46.360 --> 00:14:50.590
And what we typically – or, what we
should see if we just go site-specific,
00:14:50.590 --> 00:14:52.660
we have a smaller
value of the aleatory –
00:14:52.660 --> 00:14:57.480
I mean, that must happen because
otherwise, if we aggregate data,
00:14:57.480 --> 00:15:00.350
it’s always
going to be larger.
00:15:00.350 --> 00:15:05.330
So what we have is, each site has
an individual median value.
00:15:05.330 --> 00:15:10.920
And we typically try to write this
as some adjustment to some
00:15:10.920 --> 00:15:15.700
aggregated value, and it usually
also has smaller variability
00:15:15.700 --> 00:15:19.120
than the aggregated
distribution of the sites.
00:15:19.930 --> 00:15:22.120
So people have known
that for a long time,
00:15:22.120 --> 00:15:27.600
and a lot of work has been done in
trying to relax the ergodic assumptions.
00:15:28.240 --> 00:15:32.680
So basically, I think,
if you go, like, 20 years ago,
00:15:32.680 --> 00:15:36.870
data was quite sparse,
and data just had to be aggregated.
00:15:36.870 --> 00:15:41.660
And then people started doing
global models, so that means –
00:15:41.660 --> 00:15:45.140
and that’s not 20 years ago,
but for example, NGA-West1,
00:15:45.140 --> 00:15:48.520
they did – they had global models.
And they had data from, let’s say,
00:15:48.520 --> 00:15:52.560
Japan in there, California,
Europe, Taiwan.
00:15:52.570 --> 00:15:55.769
And they just make
one particular model.
00:15:55.769 --> 00:16:00.940
And then – I’m actually –
I’m one slide ahead, actually.
00:16:00.940 --> 00:16:09.910
So what I actually wanted to say
with this slide is, from an ergodic –
00:16:09.910 --> 00:16:12.579
moving from an ergodic to
nonergodic is actually –
00:16:12.579 --> 00:16:16.930
we just take an ergodic and
we make some adjustments.
00:16:16.930 --> 00:16:20.139
And those adjustments need to
become location-specific, right?
00:16:20.139 --> 00:16:24.200
Like, in the other – in the simple
example, there were adjustments
00:16:24.200 --> 00:16:29.450
at each particular site.
And – but in the real world,
00:16:29.450 --> 00:16:33.410
we just make them location-specific.
So that means we have – if we have
00:16:33.410 --> 00:16:37.490
an earthquake in a particular location,
we maybe add something to the median
00:16:37.490 --> 00:16:41.410
or subtract something
from the median.
00:16:41.410 --> 00:16:45.600
At a particular site, we have
site effects, so we add that.
00:16:45.600 --> 00:16:49.850
And then there might be path effects that
are not modeled in an ergodic model,
00:16:49.850 --> 00:16:56.150
so we also have to account for that.
So, throughout this talk, I will
00:16:56.150 --> 00:17:01.670
use t for location, which was
used in the Landwehr paper,
00:17:01.670 --> 00:17:07.260
and I just decided to
be consistent with this.
00:17:07.260 --> 00:17:14.520
So this should be – this should be there
at every particular – at every particular
00:17:14.520 --> 00:17:19.410
scenario that we want to calculate.
If we have a source and we have a site,
00:17:19.410 --> 00:17:24.690
that – and for, let’s say, a magnitude 6 in
20 kilometers or in 100 kilometers, that
00:17:24.690 --> 00:17:28.840
should be different if we’re in northern
California or in southern California.
00:17:28.840 --> 00:17:34.390
Or that should be different even if the –
if we have one source, and the path
00:17:34.390 --> 00:17:38.299
goes to the east or the path
goes to the – to the west.
00:17:38.300 --> 00:17:42.540
Because the geology will be different,
and there will be different path effects.
00:17:42.540 --> 00:17:48.500
Sometimes these things will be strong,
and sometimes these will be less strong.
00:17:49.120 --> 00:17:53.610
Since this is something,
if we go to, like, nonergodic models,
00:17:53.610 --> 00:17:58.360
this is something that we – so this
is an unmodeled effect before.
00:17:58.360 --> 00:18:01.890
So now we’re modeling it. That means
we should have a smaller value
00:18:01.890 --> 00:18:08.120
of the noise. So we have a smaller
value of the aleatory variability.
00:18:09.620 --> 00:18:13.720
One big problem is that data is sparse.
Even though we have thousands of
00:18:13.720 --> 00:18:20.000
data points in California, and even
more in Japan, it’s still quite sparse
00:18:20.000 --> 00:18:26.500
to get a really good grip of those,
and especially at all possible locations
00:18:26.500 --> 00:18:29.700
that we want to look at.
So that means we have to kind of –
00:18:29.700 --> 00:18:33.360
we still have to aggregate
the data somehow.
00:18:33.360 --> 00:18:38.720
And what’s really, really
important is – for hazard is that
00:18:38.730 --> 00:18:43.140
we need to take into account
the adjustment in those things.
00:18:43.140 --> 00:18:47.720
We can estimate the site effect from,
like, five records at that site.
00:18:47.720 --> 00:18:51.550
And we can estimate it
from 100 records at that site.
00:18:51.550 --> 00:18:54.800
If we have 100 records, we’ll be
more certain about that site effect.
00:18:54.800 --> 00:18:57.440
And from five records,
we will be less certain.
00:18:57.440 --> 00:19:01.270
So that needs to be reflected
in the application.
00:19:01.270 --> 00:19:03.380
Right.
00:19:03.380 --> 00:19:08.840
So people have known
this for a long time.
00:19:08.840 --> 00:19:19.740
And they try to include that in, like,
recent models over the last few years.
00:19:19.740 --> 00:19:24.170
And one way is to make it –
to make some model broadly
00:19:24.170 --> 00:19:28.900
location-specific is to put in
some region-specific effects.
00:19:29.600 --> 00:19:34.640
So, for example, like in NGA,
NGA-West1 was global.
00:19:34.640 --> 00:19:41.580
NGA-West2 had adjustment coefficients
for Japan, California, and Taiwan and –
00:19:41.580 --> 00:19:44.570
I don’t know – I think – don’t think
they had adjustments for Europe.
00:19:44.570 --> 00:19:49.740
But you can basically do – you have
the same sort of model structure,
00:19:49.740 --> 00:19:56.600
but then you say, okay, if I’m in Japan,
I change my Vs30 scaling a bit because
00:19:56.600 --> 00:19:59.660
the site effects in Japan
are slightly different.
00:19:59.660 --> 00:20:04.159
If I’m in Taiwan, I also
change it a little bit.
00:20:04.159 --> 00:20:07.990
So what that typically looks like,
if you have some function for the
00:20:07.990 --> 00:20:12.370
median, is that you have some function
of magnitude, distance, and so on.
00:20:12.370 --> 00:20:16.230
And then you have some, let’s say,
global coefficient. And then you have
00:20:16.230 --> 00:20:20.140
some adjustment coefficient that
depends on the particular region.
00:20:20.140 --> 00:20:24.360
And people have been doing that
for a long time, and then this also
00:20:24.360 --> 00:20:26.470
should have some
sort of uncertainty.
00:20:26.470 --> 00:20:32.269
So we could usually model that as a
random effect or something like that.
00:20:32.269 --> 00:20:35.870
Another way is – and that’s, I think,
probably what a lot of people know
00:20:35.870 --> 00:20:43.710
is single-station sigma.
That means, instead of just adding
00:20:43.710 --> 00:20:49.790
or just aggregating data from all sites,
we’ll look at each site individually.
00:20:49.790 --> 00:20:53.910
And so even sites that have the
same Vs30 value might have –
00:20:53.910 --> 00:20:58.100
might lead to different
average predictions.
00:20:58.100 --> 00:21:04.299
So then we have some –
just some sort of site constant that is –
00:21:04.299 --> 00:21:08.880
that S term – that will be
specific to each particular site.
00:21:08.880 --> 00:21:12.650
And then also this actually
takes a good chunk of the
00:21:12.650 --> 00:21:17.020
aleatory variability
out of the sigma.
00:21:19.040 --> 00:21:23.800
A simple example for this.
If you just go further and further and
00:21:23.800 --> 00:21:30.570
just to show how that would work, is
example regionalization of California.
00:21:30.570 --> 00:21:35.730
So what this shows is event
terms from the ASK14 model,
00:21:35.730 --> 00:21:38.750
so that’s one of the
NGA-West2 models.
00:21:38.750 --> 00:21:44.640
And so that basically is, like,
an average shift for each event.
00:21:44.640 --> 00:21:49.120
And that’s plotted at
each hypocenter location.
00:21:49.120 --> 00:21:55.140
And what we can see – I don’t know
if you can see this, but on average,
00:21:55.150 --> 00:21:59.980
we have more bluish colors up here
and more reddish colors down there.
00:21:59.980 --> 00:22:06.490
And that means that, on average, we
might have slightly lower predictions in
00:22:06.490 --> 00:22:11.860
northern California and slightly larger
predictions in southern California.
00:22:11.860 --> 00:22:17.680
There’s probably a lot of other things
mapped into these event terms if we
00:22:17.690 --> 00:22:24.580
only have event terms, but right now,
this is just a simple example.
00:22:24.580 --> 00:22:29.980
So one way to just try to put that
into a model is to have some
00:22:29.980 --> 00:22:32.870
very, very simple regionalization.
That means we have a different
00:22:32.870 --> 00:22:36.700
constant for northern California
and southern California.
00:22:36.700 --> 00:22:42.240
That means I just put it as an example.
I say if the latitude of the hypocenter
00:22:42.240 --> 00:22:49.690
is larger than 36.5, it gets a different
constant than if the latitude –
00:22:49.690 --> 00:22:53.760
if the latitude is – so it’s different
if it’s in northern California
00:22:53.760 --> 00:22:55.630
and in southern California.
00:22:55.630 --> 00:23:00.270
And if you do this, then this
actually shows on the map.
00:23:00.270 --> 00:23:03.560
It shows the c-1 and c-2 value.
Then c-1 is, on average,
00:23:03.560 --> 00:23:08.610
a little bit negative.
And c-2 is just slightly positive.
00:23:08.610 --> 00:23:13.419
And then, on average, it works out that,
if you would redo these event terms
00:23:13.419 --> 00:23:17.799
using the regionalized models, you
don’t really see the strength anymore.
00:23:17.799 --> 00:23:21.740
That means we’ve kind of
taken care of that regional trend
00:23:21.740 --> 00:23:25.059
that we saw in
those event terms.
00:23:25.059 --> 00:23:30.610
One big problem with this, or one
problem with this, is, if we look back
00:23:30.610 --> 00:23:34.540
at this, we have mainly data here,
and we have mainly data there.
00:23:34.540 --> 00:23:39.280
So that means this whole thing is
kind of driven by data that’s here,
00:23:39.280 --> 00:23:43.620
and it’s driven by data that’s here.
But we sort of apply it everywhere –
00:23:43.620 --> 00:23:47.890
we would probably apply it
everywhere in California.
00:23:47.890 --> 00:23:53.720
So that means, are we really certain
that something – if an event happened
00:23:53.720 --> 00:23:58.360
somewhere around here,
that would be lower on average?
00:23:58.360 --> 00:24:02.150
Because it’s, like, mainly driven
from data in the Bay Area.
00:24:02.150 --> 00:24:05.300
And the other thing is,
36.5 is pretty arbitrary.
00:24:05.300 --> 00:24:09.710
Why is it not 36.4?
Why is not 36-something?
00:24:09.710 --> 00:24:15.820
We can maybe optimize for that.
But that’s not really – so this is kind of –
00:24:15.820 --> 00:24:19.720
it’s a simple model. It works.
It’s statistically – the difference
00:24:19.720 --> 00:24:24.440
is statistically significant.
It leads to a slightly smaller aleatory
00:24:24.440 --> 00:24:28.630
variability, so that makes us happy.
But it’s not really a good model,
00:24:28.630 --> 00:24:33.460
and it’s not really –
I just drew a line on the map.
00:24:33.460 --> 00:24:37.520
Why did I not drew this like this?
Why – well, this was easier.
00:24:37.520 --> 00:24:41.560
But, so it’s not really –
it’s not really a good model.
00:24:41.560 --> 00:24:45.300
The other thing is, okay, to take care
of that, maybe we can make this
00:24:45.309 --> 00:24:51.720
thing maybe dependent on geology.
Or maybe we can – if we don’t want to
00:24:51.720 --> 00:24:55.850
apply this everywhere in California,
we just have one region here, and we
00:24:55.850 --> 00:24:59.470
have more regions somewhere else.
But then this becomes –
00:24:59.470 --> 00:25:03.669
we make the region smaller.
That should give us a little bit better
00:25:03.669 --> 00:25:09.190
handle on those regional –
on the regionalization.
00:25:09.190 --> 00:25:15.049
But that’s not really that great.
So basically – and that was the reason
00:25:15.049 --> 00:25:21.660
why we came up with this varying
coefficient model is that – I mean,
00:25:21.660 --> 00:25:24.429
defining those regional
boundaries is kind of arbitrary.
00:25:24.429 --> 00:25:29.520
You can make it sort of based on
geology, but it’s also – it’s usually just,
00:25:29.520 --> 00:25:32.520
we draw a line on the map
based on where we have data
00:25:32.530 --> 00:25:34.600
and where we don’t
really have data.
00:25:34.600 --> 00:25:39.880
And then the other thing is, there’s –
I think there’s probably a trend to make
00:25:39.880 --> 00:25:43.799
it – if we have more and more data, we
can make the region smaller and smaller.
00:25:43.799 --> 00:25:47.520
But if we make the region smaller
and smaller and smaller and smaller,
00:25:47.520 --> 00:25:51.040
that means, at some point,
the regions become a point.
00:25:51.040 --> 00:25:55.549
That means we get something like
a continuous regionalization.
00:25:55.549 --> 00:26:00.500
And that is basically the
idea behind the varying coefficient
00:26:00.500 --> 00:26:07.039
model of Landwehr et al.
So that is a model where coefficients
00:26:07.040 --> 00:26:10.340
are a continuous
function of location.
00:26:10.900 --> 00:26:16.060
Location is a different topic.
What does location really mean?
00:26:16.060 --> 00:26:20.880
But that’s the way it is.
And I think Niels was –
00:26:20.880 --> 00:26:26.890
Niels Landwehr was a postdoc or
so in the machine learning department
00:26:26.890 --> 00:26:31.360
in Potsdam, and I knew him from there.
And then he sent me an email saying,
00:26:31.360 --> 00:26:36.280
oh, we’ve got this cool spatial model.
We think you’ve got earthquake data.
00:26:36.280 --> 00:26:39.500
That’s spatial data.
Let’s try it out on that.
00:26:39.500 --> 00:26:43.179
And so that’s where you came,
and then we brought Norm in.
00:26:43.179 --> 00:26:48.950
And that’s how we do this.
So just a very, very simple example.
00:26:48.950 --> 00:26:54.740
Typically, we would have,
let’s say, a ground motion model –
00:26:54.740 --> 00:26:57.220
not a very good one.
Arguably, it would look
00:26:57.230 --> 00:26:59.870
maybe something like this.
We have some constants.
00:26:59.870 --> 00:27:05.650
We have some beta coefficients, and
then we have some magnitude scaling.
00:27:05.650 --> 00:27:11.700
We have a geometric spreading term.
We have an anelastic attenuation term.
00:27:11.700 --> 00:27:16.860
And then the only thing that we
did in the varying coefficient model
00:27:16.860 --> 00:27:20.510
is make some of those coefficients
dependent on location.
00:27:21.280 --> 00:27:26.299
So that means, in this case,
t-e is the coordinate of the event,
00:27:26.299 --> 00:27:30.279
and t-s is the coordinate of the station.
And it means we have some
00:27:30.279 --> 00:27:32.940
coefficients – let’s say, like,
00:27:32.940 --> 00:27:37.279
we have constants here,
and we make those constants
00:27:37.279 --> 00:27:40.630
dependent on earthquake
location and site location.
00:27:40.630 --> 00:27:45.280
That means we have a different
constant at each point in California,
00:27:45.280 --> 00:27:48.340
depending on whether the –
if there’s an earthquake.
00:27:48.340 --> 00:27:52.710
We also make the – we can make
the geometrical spreading.
00:27:52.710 --> 00:27:58.420
So in the Landwehr paper, we made
that event-specific – dependent.
00:27:58.420 --> 00:28:01.770
And then we have the
anelastic attenuation,
00:28:01.770 --> 00:28:06.309
which should depend on both
event location and site location.
00:28:06.309 --> 00:28:10.910
Because it’s a path-specific term, right?
So it depends, if I have an earthquake
00:28:10.910 --> 00:28:16.090
here, and an earthquake at the site to
the west, and the site to the north,
00:28:16.090 --> 00:28:19.180
that should – the path is different,
so I should have a different
00:28:19.180 --> 00:28:23.330
anelastic attenuation.
We didn’t really do that in the original
00:28:23.330 --> 00:28:27.340
paper, but this is definitely something
where things can be improved.
00:28:27.340 --> 00:28:31.679
So basically, what this does is,
we take some function that is
00:28:31.679 --> 00:28:34.460
a function of magnitude, distance,
and some other parameters,
00:28:34.460 --> 00:28:39.920
and we make some of the coefficients a
function of event and station location.
00:28:39.920 --> 00:28:43.610
Which immediately raises
the question, which ones?
00:28:43.610 --> 00:28:47.870
And so, for that, we sort of said
we probably cannot do this for
00:28:47.870 --> 00:28:50.299
the magnitude scaling because
we don’t really have that many
00:28:50.299 --> 00:28:54.310
large-magnitude events.
So we don’t really want to do that.
00:28:54.310 --> 00:29:00.620
We think the constants should be fine.
We think – I didn’t put it up here,
00:29:00.620 --> 00:29:07.381
but there’s a difference in Vs30 scaling,
probably between northern – it might
00:29:07.381 --> 00:29:12.200
be between northern – so we make
that one location-dependent.
00:29:12.200 --> 00:29:16.220
So that’s a little bit of, like, judgment.
What can we do?
00:29:16.220 --> 00:29:21.340
What do we think? Which kind of
coefficients can we get to?
00:29:21.340 --> 00:29:25.320
The other thing is really,
how should this function look like?
00:29:25.320 --> 00:29:27.960
So it’s a function of location.
So that means, how should it
00:29:27.960 --> 00:29:31.080
go from northern California
to southern California?
00:29:31.080 --> 00:29:33.980
Is it a sine function?
Is it a polynomial?
00:29:33.980 --> 00:29:37.190
Is it exponential?
So we have no idea.
00:29:37.190 --> 00:29:42.610
And that’s basically the main part
of the varying coefficient model is
00:29:42.610 --> 00:29:47.490
that we assume that it is drawn
from a Gaussian process prior.
00:29:47.490 --> 00:29:51.549
And that’s some very nice – that has
some very nice properties that we can
00:29:51.549 --> 00:29:57.500
use and that we want to have here.
So basically, in Bayesian
00:29:57.500 --> 00:30:02.520
nonparametrics, a Gaussian
process prior is often used as
00:30:02.530 --> 00:30:06.559
a prior over functions.
So that means, if you have – if you’re
00:30:06.559 --> 00:30:09.570
familiar with Bayesian, you often have –
you have a prior distribution for
00:30:09.570 --> 00:30:12.860
some coefficient, and then you
have a posterior distribution.
00:30:12.860 --> 00:30:18.020
Here, what we have is, we have
a distribution over functions.
00:30:18.020 --> 00:30:23.049
And then, once we observe data, we get
some sort of posterior distribution over
00:30:23.049 --> 00:30:28.419
functions, which makes – which allows
us to quantify them a little bit.
00:30:28.419 --> 00:30:33.309
So this is one example –
just one-dimensional example.
00:30:33.309 --> 00:30:39.050
So here that would be three different
functions drawn from this Gaussian
00:30:39.050 --> 00:30:43.779
process prior in the – if we
don’t really have any data at all.
00:30:43.779 --> 00:30:47.039
And then this is, like,
the width of our prior.
00:30:47.039 --> 00:30:50.450
So we think, okay, this is plus/minus
one standard deviation.
00:30:50.450 --> 00:30:55.650
So if I had, like, 1,000 realizations
and looked at the standard deviation
00:30:55.650 --> 00:31:00.940
at every point, they would sort of
mostly fall in this range around zero.
00:31:00.940 --> 00:31:08.140
Once we observe data, like here, here,
and here, we get a posterior distribution.
00:31:08.140 --> 00:31:13.559
And, depending a bit on how we set it
up, now we have – now we see that all
00:31:13.559 --> 00:31:18.410
the functions go through those points.
Once we’re far away from those points,
00:31:18.410 --> 00:31:24.659
they kind of can still do whatever
they want, but they all go through
00:31:24.659 --> 00:31:29.620
the data points that we observe. And
that is really a nice property that we
00:31:29.620 --> 00:31:35.750
have – that we can use for nonergodic.
If we’re close to some data,
00:31:35.750 --> 00:31:39.020
then we know something about that.
Then we should know about
00:31:39.020 --> 00:31:41.880
something about how those
coefficients should scale
00:31:41.880 --> 00:31:45.670
or should behave close
to that data point.
00:31:45.670 --> 00:31:50.700
And whereas, if we go far away,
they can still do whatever they want.
00:31:50.700 --> 00:31:54.920
And that is – and so this also
shows the posterior uncertainty.
00:31:54.920 --> 00:31:58.330
And then, in this case,
we also see, close to data,
00:31:58.330 --> 00:32:00.540
we have very, very
small uncertainty.
00:32:00.540 --> 00:32:06.830
But the farther we go away from the
data, we have really large uncertainty.
00:32:06.830 --> 00:32:10.850
And that is exactly what we
want for nonergodic models.
00:32:10.850 --> 00:32:14.850
Because, if we have some data,
then we can – then we think
00:32:14.850 --> 00:32:17.510
we know something.
If there’s no data – if we go
00:32:17.510 --> 00:32:21.020
to northeastern California
where we don’t have data,
00:32:21.020 --> 00:32:24.140
we don’t know anything.
So the uncertainty needs to be large.
00:32:24.140 --> 00:32:28.020
So that means, in this case,
we’re kind of maybe here.
00:32:28.020 --> 00:32:31.520
So we – like, really,
really big uncertainty.
00:32:32.740 --> 00:32:37.840
This is how it looks like,
and so I decided to put this in.
00:32:37.850 --> 00:32:41.460
Might look intimidating at first,
but it really isn’t.
00:32:41.460 --> 00:32:46.060
So what we have is, we have some of
those coefficients, like beta-i, like the
00:32:46.060 --> 00:32:51.800
constant, or anelastic continuation –
is a function of location.
00:32:51.809 --> 00:32:54.480
And the way we write this,
we put a Gaussian process
00:32:54.480 --> 00:32:58.320
prior over that
particular coefficient.
00:32:58.320 --> 00:33:02.140
And then Gaussian process
prior is defined by a mean function
00:33:02.140 --> 00:33:04.120
and a covariance function.
00:33:04.120 --> 00:33:06.340
And the covariance
function is this.
00:33:06.340 --> 00:33:11.960
And we use these two –
or, this covariance function
00:33:11.960 --> 00:33:18.770
for all the coefficients in our model.
So what that basically means is that
00:33:18.770 --> 00:33:24.140
we have – for those coefficients
that are not spatially varying,
00:33:24.140 --> 00:33:26.240
we put a constant in there.
00:33:26.240 --> 00:33:30.080
And, for those that
are spatially varying,
00:33:30.080 --> 00:33:34.779
that actually depends on the
distance between two points.
00:33:34.779 --> 00:33:36.500
That covariance matrix.
00:33:36.500 --> 00:33:38.510
And that is all function.
00:33:38.510 --> 00:33:42.350
That means, if these two points
are really, really close together,
00:33:42.350 --> 00:33:45.850
then their correlation
is pretty – is large.
00:33:45.850 --> 00:33:51.149
And then, if we go one step further and
say, how was the data generated, well,
00:33:51.149 --> 00:33:54.240
the data is generated
from a normal distribution.
00:33:54.240 --> 00:33:58.450
So all our records are generated from
normal distribution, which has
00:33:58.450 --> 00:34:05.910
a mean zero. And all the modeling
goes on in the covariance function.
00:34:05.910 --> 00:34:10.109
So we have the covariance function,
which is defined by this covariant
00:34:10.109 --> 00:34:14.089
or this covariance matrix, which is
defined by this covariance function.
00:34:14.089 --> 00:34:17.079
And then we also have some
noise term on the diagonal.
00:34:17.079 --> 00:34:21.319
Because there will always be noise.
We don’t really – what this actually
00:34:21.319 --> 00:34:27.179
means – so I don’t really want to go
into this, but this constant here means
00:34:27.179 --> 00:34:32.679
that we have a linear model in the
coefficients that do not vary spatially.
00:34:32.679 --> 00:34:36.450
And what this actually encodes is
that the spatially varying coefficients
00:34:36.450 --> 00:34:42.319
are correlated. That means the closer
to locations are, the stronger there is
00:34:42.319 --> 00:34:46.970
a correlation between the
values of those coefficients.
00:34:46.970 --> 00:34:52.259
And that actually allows us to do some
aggregation of data, and that – because,
00:34:52.260 --> 00:34:55.740
if you think about it,
it’s a continuously varying model.
00:34:55.740 --> 00:34:59.059
We have infinitely
many data locations.
00:34:59.059 --> 00:35:01.069
But we don’t really have
the data to do that.
00:35:01.069 --> 00:35:04.680
So that means, by assuming –
or, by modeling them as
00:35:04.680 --> 00:35:09.230
spatially correlated, or modeling
them with the GP prior, we can
00:35:09.230 --> 00:35:13.700
actually aggregate some data
that is sort of close together.
00:35:15.620 --> 00:35:20.160
How that actually looks like –
looks like compared to a typical
00:35:20.160 --> 00:35:23.940
regression is that, say,
in a typical regression, let’s say
00:35:23.940 --> 00:35:27.660
we observe Y-1 to Y-N.
That’s our observed data.
00:35:27.660 --> 00:35:32.640
Let’s say log PGA
at N different records.
00:35:32.650 --> 00:35:37.180
And then we typically have a normal
distribution, and the mean is some
00:35:37.180 --> 00:35:40.359
function depending on magnitude,
distance, and so on,
00:35:40.359 --> 00:35:45.369
and some coefficients.
And then we just have the covariance
00:35:45.369 --> 00:35:49.920
matrix for that normal distribution.
It’s just a diagonal matrix with –
00:35:49.920 --> 00:35:55.160
actually that should be sigma-squared –
with the sigma-squared on the diagonal.
00:35:55.160 --> 00:36:02.979
And on the non-diagonal, the entries
of that covariance matrix are zero.
00:36:02.979 --> 00:36:09.450
On the other hand, if we – in a typical
GMPE, that looks pretty similar.
00:36:09.450 --> 00:36:15.540
The only thing is, now we replace sigma
with phi-squared plus tau-squared.
00:36:15.540 --> 00:36:19.940
And then we have some off –
we can have off-diagonal elements.
00:36:19.940 --> 00:36:24.160
And in a typical GMPE, that’s
just accounting for event terms.
00:36:24.160 --> 00:36:30.000
That means, if we have – if the i-th
and j-th record are from the same event,
00:36:30.009 --> 00:36:34.089
we have a tau-squared on the –
in the covariance matrix.
00:36:34.089 --> 00:36:37.359
And then the only thing that
we’re doing, we’re just making
00:36:37.359 --> 00:36:42.119
this one a lot more complicated.
So now, in the varying coefficient
00:36:42.119 --> 00:36:48.380
GP model, we’re just moving
everything to the covariance matrix –
00:36:48.380 --> 00:36:52.450
into the covariance matrix of the data.
We still assume that all of our
00:36:52.450 --> 00:36:57.319
observed data is a
multivariate normal distribution.
00:36:57.319 --> 00:37:00.720
But now we have – we assume that
has a mean zero, and all the
00:37:00.720 --> 00:37:04.840
modeling goes on in
our covariance matrix.
00:37:06.380 --> 00:37:10.340
There’s probably some sort of step in
between, and I think that’s probably
00:37:10.349 --> 00:37:15.040
the way to go is to put some
part back into the mean function.
00:37:15.040 --> 00:37:19.760
But still have some sort of
complicated model in the covariance
00:37:19.760 --> 00:37:27.780
function which accounts for spatial
correlation between some coefficients
00:37:27.789 --> 00:37:31.109
or some effects that
we want to model.
00:37:31.109 --> 00:37:35.749
And then we still need something –
and we can make the covariance
00:37:35.749 --> 00:37:38.469
matrix scale
as complicated as we want.
00:37:38.469 --> 00:37:43.809
And, depending on how we model –
how we set it up, we can model
00:37:43.809 --> 00:37:47.839
different things. But this is – this is
really something that we need to think
00:37:47.840 --> 00:37:54.520
about, and the next models will not
look exactly like the previous ones.
00:37:55.780 --> 00:37:58.809
So this is – again, this is just the
model that we had, so we have
00:37:58.809 --> 00:38:06.229
some sort of coefficients that
we make spatially correlated.
00:38:06.229 --> 00:38:12.640
And then we set it up that each of them
is – has this sort of covariance function.
00:38:12.640 --> 00:38:16.280
And what actually turns out is that
the parameters of the models that
00:38:16.280 --> 00:38:22.579
we estimate are only the parameters
of the covariance functions.
00:38:22.579 --> 00:38:27.970
So we only need the variances and
the length scales of our covariance
00:38:27.970 --> 00:38:31.760
functions to make predictions.
That means we do not estimate
00:38:31.760 --> 00:38:36.150
the individual coefficients.
We do not estimate beta-1, beta-2.
00:38:36.150 --> 00:38:39.599
We do not estimate
beta-zero at each location.
00:38:39.600 --> 00:38:45.920
We don’t need that to do prediction,
and so this model is a very –
00:38:45.920 --> 00:38:48.869
it’s geared to what’s
prediction of new data points.
00:38:48.869 --> 00:38:53.499
To predict new data – to make
new predictions, the only thing
00:38:53.500 --> 00:39:00.420
you need to know are the parameters
of the covariance function.
00:39:01.020 --> 00:39:05.520
So a quick word
about prediction.
00:39:05.529 --> 00:39:11.819
If you know your multivariate normal
distribution, then it’s pretty easy to do.
00:39:11.820 --> 00:39:15.820
We just condition – if we want to
make a prediction, add a new, let’s say,
00:39:15.820 --> 00:39:19.480
magnitude-distance scenario
at a new location, we just
00:39:19.489 --> 00:39:24.319
condition on observed data. And that
basically means we just write it out.
00:39:24.319 --> 00:39:29.989
We have our observed data, and then
we have our new data point that
00:39:29.989 --> 00:39:36.369
we want to predict. That’s just – again,
that’s a normal distribution. And it has
00:39:36.369 --> 00:39:41.359
this covariance. So this is just the
covariance of the observed data.
00:39:41.359 --> 00:39:44.410
We can calculate that.
We know all of that.
00:39:44.410 --> 00:39:49.900
This is the covariance between the
new data point and the observed data.
00:39:49.900 --> 00:39:53.960
Then only depends on location
and magnitude, distance, and so on.
00:39:53.960 --> 00:39:55.970
So that’s pretty easy to do.
00:39:55.970 --> 00:40:01.269
And that’s the covariance of
the new data point with itself.
00:40:01.269 --> 00:40:08.329
And then it turns out that, if you do that,
then you can calculate the mean
00:40:08.329 --> 00:40:12.949
and the variance of the – of Y-star.
That means the mean of the variance
00:40:12.949 --> 00:40:17.410
of the ground motion distribution at
that new set of predictor variables,
00:40:17.410 --> 00:40:24.329
which is – I think it’s – the mean is just
this times – so this covariance times
00:40:24.329 --> 00:40:31.449
this covariance times the observed data.
And if you want to know the variance
00:40:31.449 --> 00:40:36.460
of this one, then it’s just this
minus this times this times this.
00:40:36.460 --> 00:40:41.059
That’s pretty – I mean,
it’s very easy to calculate.
00:40:41.059 --> 00:40:45.440
It can take a while to compute,
depending on how large these
00:40:45.440 --> 00:40:50.840
matrices are, but that’s all
there is to it – there is to it.
00:40:51.000 --> 00:40:57.180
So then, results. So this is just
taken from Landwehr et al.
00:40:57.199 --> 00:41:01.670
I think there’s a lot of room for
improvement for that particular model.
00:41:01.670 --> 00:41:05.520
So what we see – what we see here is,
on the left is the – are the median
00:41:05.520 --> 00:41:11.540
predictions for a magnitude 6
and a distance of 20 kilometers
00:41:11.549 --> 00:41:17.329
all over California.
And I think, just for simplicity,
00:41:17.329 --> 00:41:22.099
we set the station location the
same as the earthquake location,
00:41:22.099 --> 00:41:25.849
which is not really realistic,
but it made it easy to plot.
00:41:25.849 --> 00:41:29.779
And on the right we have the
associated epistemic uncertainty.
00:41:29.779 --> 00:41:33.839
So that is just calculated by
conditioning on the data and then
00:41:33.839 --> 00:41:36.979
just moving through all of
California and then calculating
00:41:36.979 --> 00:41:39.309
a prediction at each point.
00:41:39.309 --> 00:41:45.579
What we see is, we see – there is
quite a very – quite a large variation
00:41:45.579 --> 00:41:50.059
in predictions across California.
We have some lower predictions
00:41:50.059 --> 00:41:52.499
in northern California.
We have some larger predictions
00:41:52.499 --> 00:41:56.731
in southern California.
But we also see that, once we get to
00:41:56.731 --> 00:42:00.589
northeast [inaudible] on the regions
where we don’t really have a lot of data,
00:42:00.589 --> 00:42:05.489
we sort of go back to – so we don’t
really see those big effects anymore.
00:42:05.489 --> 00:42:08.180
And that’s because we
don’t have data there.
00:42:08.180 --> 00:42:12.599
That means we don’t – we cannot
really constrain different predictions.
00:42:12.599 --> 00:42:15.259
So we just revert back to
some sort of mean.
00:42:15.260 --> 00:42:20.280
We also see this in the
associated uncertainty.
00:42:21.020 --> 00:42:26.510
That is small here and here. So that’s
small everywhere where we have data.
00:42:26.510 --> 00:42:30.089
And it’s large everywhere
else where we don’t have data.
00:42:30.089 --> 00:42:35.059
And that kind of coincides nicely
with this because, well, on average,
00:42:35.059 --> 00:42:40.569
our adjustment – our adjustments to a
median prediction will be zero there.
00:42:40.569 --> 00:42:43.529
But that doesn’t mean that
those adjustments don’t exist.
00:42:43.529 --> 00:42:45.999
We just don’t know
how large they are.
00:42:45.999 --> 00:42:50.680
That means we need to put these –
we need to account that by having
00:42:50.680 --> 00:42:55.069
a very, very large
epistemic uncertainty.
00:42:55.069 --> 00:43:00.280
We also have coefficients.
And I said that we don’t estimate
00:43:00.280 --> 00:43:04.739
coefficients, but the way these
coefficients – we can still calculate
00:43:04.740 --> 00:43:09.260
them, and the way these coefficients
are calculated is by actually just
00:43:09.260 --> 00:43:13.880
generating synthetic data
at all – at different locations
00:43:13.880 --> 00:43:16.619
and then fitting
a linear model to that.
00:43:16.619 --> 00:43:20.700
And that gives us – so, because
the predictions are different,
00:43:20.700 --> 00:43:24.119
that will give us a different set
of coefficients [inaudible].
00:43:24.119 --> 00:43:25.959
So, on the left,
we have a constant.
00:43:25.959 --> 00:43:29.459
And, on the right, we have
the geometrical spreading.
00:43:29.459 --> 00:43:34.819
And, as you can see,
these things are correlated.
00:43:34.819 --> 00:43:40.420
Annemarie makes that point all the
time that these things go together.
00:43:40.420 --> 00:43:43.660
And that is true. So we can’t
just say, oh, yeah, this is –
00:43:43.660 --> 00:43:47.000
we just – oh, we have this constant.
We just apply this.
00:43:47.000 --> 00:43:51.940
No. At the same time, we – if we make
the constant larger, we have to make the
00:43:51.941 --> 00:43:55.989
geometrical spreading a little bit smaller.
Those two things go together.
00:43:55.989 --> 00:43:58.580
These things go
together all the time.
00:43:58.580 --> 00:44:05.140
So, yeah, I mean, that’s just –
but the way we then use this is
00:44:05.140 --> 00:44:09.440
we use these coefficients,
calculated from synthetic data or
00:44:09.460 --> 00:44:15.580
from model predictions, we use these
coefficients later in the hazard run.
00:44:16.180 --> 00:44:19.780
The one thing that’s
not really well done
00:44:19.780 --> 00:44:23.900
in the VCM is the
anelastic attenuation.
00:44:24.780 --> 00:44:29.979
Initially, we had the anelastic
attenuation as a function of event
00:44:29.979 --> 00:44:34.650
location, which means it's isotropic
in all directions, which is certainly
00:44:34.650 --> 00:44:41.430
not correct. We had written a paper.
It was pretty much finished.
00:44:41.430 --> 00:44:46.009
And then I think Frank Scherbaum,
my old adviser, said, why don’t you
00:44:46.009 --> 00:44:49.239
just use the midpoint between
earthquake and station location?
00:44:49.239 --> 00:44:52.459
That actually is a better thing
to do instead of making it –
00:44:52.459 --> 00:44:55.509
but it was way too late
to actually change that.
00:44:55.509 --> 00:44:59.529
But that would be a better way to do it.
But ideally, it should depend on
00:44:59.529 --> 00:45:03.160
the event and station location.
That means if we have an event
00:45:03.160 --> 00:45:08.920
here, and – I don’t know.
So, if we have an event and a station,
00:45:08.920 --> 00:45:11.680
and if we have the same event
and the station is somewhere else,
00:45:11.680 --> 00:45:15.459
we should get a different
value for that coefficient.
00:45:15.459 --> 00:45:20.209
The problem is that that makes
this a little bit – you can still do it,
00:45:20.209 --> 00:45:23.969
but it makes it, like,
four-dimensional.
00:45:23.969 --> 00:45:25.979
Because you have
four different locations.
00:45:25.979 --> 00:45:30.039
And it’s still not really great.
There are still some other problems.
00:45:30.039 --> 00:45:33.339
And – because – and then –
so we decided not to actually
00:45:33.339 --> 00:45:37.140
do this for the hazard runs.
And because we think, overall,
00:45:37.140 --> 00:45:42.390
the anelastic attenuation should
actually depend on the path traveled.
00:45:42.390 --> 00:45:48.210
So what we then did, we used
a model that was introduced by
00:45:48.210 --> 00:45:52.930
Dawood and Rodriguez-Marek.
And we used a cell-specific attenuation
00:45:52.930 --> 00:45:56.470
model. That’s very, very simple to do.
It’s very simple to calculate.
00:45:56.470 --> 00:46:02.180
Basically, what the [inaudible] of this
is the – or, this is the event location.
00:46:02.180 --> 00:46:05.150
That’s the station.
00:46:05.150 --> 00:46:08.259
They just divide the
whole space where they are
00:46:08.259 --> 00:46:10.979
into different cells
of a certain size.
00:46:10.979 --> 00:46:15.150
And then you just calculate the path
traveled in each of those cells.
00:46:15.150 --> 00:46:20.969
And then each cell gets a
different attenuation coefficient.
00:46:20.969 --> 00:46:25.269
And then you just sum the
attenuation coefficient for that cell
00:46:25.269 --> 00:46:27.859
times the path
traveled in that cell.
00:46:27.859 --> 00:46:33.559
It’s very easy to do. It’s very easy
to calculate those distances.
00:46:33.559 --> 00:46:38.630
And it’s not really that complicated
to get the – get those coefficients.
00:46:38.630 --> 00:46:42.890
In an ergodic model,
this would just be beta-a times R.
00:46:42.890 --> 00:46:46.009
So this would just be
one coefficient everywhere.
00:46:46.009 --> 00:46:50.069
And if you go from here to there, it’s the
same attenuation if you go from here to
00:46:50.069 --> 00:46:55.220
here. It’s always the same attenuation,
which is certainly not correct.
00:46:56.040 --> 00:46:59.819
And then what we did is we extended
the cell-specific model because
00:46:59.819 --> 00:47:05.619
one thing that we really wanted to
have is uncertainty in those cell-specific
00:47:05.619 --> 00:47:11.150
coefficients. Because that’s really,
really important to carry that through.
00:47:11.150 --> 00:47:15.890
So they didn’t do that
in the original paper.
00:47:15.890 --> 00:47:21.190
It was also – it was done for Japan
and subduction data, so we needed to
00:47:21.190 --> 00:47:24.720
have it for California.
So we made the whole thing –
00:47:24.720 --> 00:47:28.880
we modeled everything as a Bayesian
model, and we modeled the cell-specific
00:47:28.890 --> 00:47:33.599
coefficients as random effects,
which is – which allows us to nicely
00:47:33.599 --> 00:47:37.880
keep track of the associated uncertainty.
So basically, what we have – each of
00:47:37.880 --> 00:47:44.189
those cell-specific coefficients
is assumed to be normally distributed
00:47:44.189 --> 00:47:48.630
with a certain mean. That would be,
like, an average California attenuation
00:47:48.630 --> 00:47:53.099
and a certain standard deviation.
And that is, like, the range of
00:47:53.099 --> 00:47:56.670
possible attenuation
coefficients across California.
00:47:56.670 --> 00:48:05.900
We also make it truncated so that
we don’t get – that we don’t get
00:48:05.900 --> 00:48:11.800
coefficients that are larger than zero.
What we see here is, on the left is all
00:48:11.809 --> 00:48:17.959
the paths, and so that’s the stations
in red and the events in blue and all the
00:48:17.959 --> 00:48:24.609
paths in the ASK14 data sets
for the – for the NGA-West2 model.
00:48:24.609 --> 00:48:29.319
And on the right we see how many
paths are traveled through each cell.
00:48:29.319 --> 00:48:33.650
We have lots of paths here
and not really that many
00:48:33.650 --> 00:48:36.760
in northeastern
California, and so on.
00:48:37.300 --> 00:48:42.760
And then, this is how that looks like.
On the left, we have the average
00:48:42.769 --> 00:48:47.239
difference to the mean.
So that would be beta minus mu-beta.
00:48:47.239 --> 00:48:52.640
So that’s – so we see, where we actually
do have data, we do change the mean.
00:48:52.640 --> 00:48:56.580
So we do see differences to,
let’s say, the average coefficient.
00:48:56.580 --> 00:49:01.759
Whereas, where we don’t have data,
we don’t really – it all just falls back
00:49:01.759 --> 00:49:06.150
towards the mean. So the adjustment
is pretty much zero there.
00:49:06.150 --> 00:49:09.640
And that corresponds,
and that always goes together.
00:49:09.640 --> 00:49:14.650
We have strong systematic effects.
It means we have – but we also are
00:49:14.650 --> 00:49:17.200
more certain about that
because we have data there.
00:49:17.200 --> 00:49:20.109
Whereas, if we don’t have data,
then we have really, really
00:49:20.109 --> 00:49:24.340
large uncertainty
in those coefficients.
00:49:24.340 --> 00:49:26.619
What you can see here is –
I don’t know if you can,
00:49:26.619 --> 00:49:29.779
but there’s a –
these two little symbols.
00:49:29.779 --> 00:49:35.489
So we then calculated predictions
for paths from here to here and the
00:49:35.489 --> 00:49:39.859
path from here to here.
So one path through cells where
00:49:39.859 --> 00:49:43.440
we have lots of uncertainty,
and one path through cells where
00:49:43.440 --> 00:49:48.099
we have very little uncertainty
because there’s a lot of data there.
00:49:48.099 --> 00:49:51.829
They’re all – they have this – you know,
same length, so they are –
00:49:51.829 --> 00:49:56.460
I think it’s 100 kilometers.
And these are the predictions of those.
00:49:56.460 --> 00:50:01.900
So on the left, we have the predictions
for the paths that goes through here.
00:50:01.900 --> 00:50:04.859
And on the right,
we have the path that goes
00:50:04.859 --> 00:50:07.439
here somewhere in
southern California.
00:50:07.439 --> 00:50:15.989
And what we can see is – so this is the
posterior distribution of the predictions.
00:50:15.989 --> 00:50:20.099
So that means we get out of the model,
we get some posterior samples.
00:50:20.099 --> 00:50:24.729
We get, like, 1,000 different samples of
coefficients that cover the distribution
00:50:24.729 --> 00:50:28.130
and the uncertainty in each cell.
And then we calculate for each of
00:50:28.130 --> 00:50:31.140
those cells, and we
calculate the median prediction.
00:50:31.140 --> 00:50:36.299
And that – and that gives us an
estimate of the epistemic uncertainty
00:50:36.299 --> 00:50:40.559
of our predictions just based on
that particular path.
00:50:40.559 --> 00:50:46.010
And that is very, very wide on the left,
and it’s very narrow on the right.
00:50:46.010 --> 00:50:49.189
But it’s also very
different from the mean.
00:50:49.189 --> 00:50:54.529
What we also have here is – so this
one is calculated using the average
00:50:54.529 --> 00:51:00.000
coefficient. So that’s a
prediction using this new beta.
00:51:00.000 --> 00:51:05.100
And so that’s the same in both
cases because it’s 100 kilometers.
00:51:05.109 --> 00:51:14.210
And then this is the prediction using the
ASK14 anelastic attenuation coefficient.
00:51:14.210 --> 00:51:18.739
That one is different from this one,
so even though we kind of use –
00:51:18.739 --> 00:51:23.119
we use all the other coefficients
from ASK14, so that’s what we have
00:51:23.120 --> 00:51:29.780
is an adjustment – an anelastic
adjustment to the ASK14 model.
00:51:29.780 --> 00:51:35.740
But we actually – our average
coefficient for California is different
00:51:35.740 --> 00:51:42.220
from the one that they have.
And the reason for that is that the
00:51:42.220 --> 00:51:45.320
ASK14, that’s an ergodic model.
That means it’s aggregated – it’s
00:51:45.320 --> 00:51:50.520
aggregating all data from California.
And that means we have – theirs has
00:51:50.529 --> 00:51:54.719
more data from southern California
than it does from northern California.
00:51:54.719 --> 00:52:01.119
And that means the ASK14 anelastic
attenuation coefficient is sort of
00:52:01.119 --> 00:52:05.749
biased toward southern California.
And if there are strong differences
00:52:05.749 --> 00:52:11.260
between southern California and
northern California, then it won’t really
00:52:11.260 --> 00:52:14.650
work that well in northern California.
On the other hand, if you model
00:52:14.650 --> 00:52:20.499
everything as a random effect,
that gets taken care of.
00:52:20.499 --> 00:52:23.720
And what we have – we have
an average over all those cells.
00:52:23.720 --> 00:52:25.440
And then the cells are different.
00:52:25.440 --> 00:52:32.210
So that’s why we get a slightly different
average coefficient of California.
00:52:32.210 --> 00:52:36.829
If we do – if we weight the anelastic
attenuation coefficient of each cell
00:52:36.829 --> 00:52:41.719
by the number of cells and then
calculate the weighted average,
00:52:41.719 --> 00:52:48.859
we get the same – the same value,
pretty much, as ASK14 out.
00:52:48.859 --> 00:52:58.329
And then, basically, the only thing that
was done in Norm’s paper for 2019 was
00:52:58.329 --> 00:53:02.609
just combining the varying coefficient
model and the cell-specific attenuation
00:53:02.609 --> 00:53:07.729
model to calculate site-specific
nonergodic seismic hazard
00:53:07.729 --> 00:53:10.279
for three different
sites in PSHA.
00:53:10.279 --> 00:53:14.920
So I say the only thing.
Took two years to write that paper.
00:53:14.920 --> 00:53:21.600
So – and to get all the calculations right.
But it’s really not that – theoretically,
00:53:21.600 --> 00:53:25.820
it’s not that complicated.
So what we took from the varying
00:53:25.820 --> 00:53:30.360
coefficient model is we have an
event location-specific constant
00:53:30.369 --> 00:53:34.880
and geometrical spreading.
We have a site location-specific
00:53:34.880 --> 00:53:41.299
constant and the Vs30 scaling
adjustment coefficient that is –
00:53:41.299 --> 00:53:45.209
that comes out of Vs30.
And we also take the epistemic
00:53:45.209 --> 00:53:51.079
uncertainty from Landwehr to –
and we apply that to the –
00:53:51.080 --> 00:53:55.900
to the constant. That was just –
that’s not really well handled.
00:53:55.900 --> 00:54:01.580
Because it’s really complicated to
actually get the uncertainty of the –
00:54:01.599 --> 00:54:08.380
of individual coefficients. So we just
put all the uncertainty into the constant.
00:54:08.380 --> 00:54:12.069
And then, from the cell-specific model,
we just took – we took the cell-specific
00:54:12.069 --> 00:54:17.150
attenuation coefficients,
and we estimated different ones.
00:54:17.150 --> 00:54:22.540
Because there will be some tradeoff with
the geometrical spreading coefficient.
00:54:22.540 --> 00:54:25.940
So we took – we actually calculated
five different sets of cell-specific
00:54:25.949 --> 00:54:31.349
attenuation coefficients to account for
uncertainty in the geometrical spreading
00:54:31.349 --> 00:54:35.660
of the base model. And then the
base model is just using, like,
00:54:35.660 --> 00:54:41.359
an aggregate model from NGA-West2.
So we took the five NGA-West2 model,
00:54:41.360 --> 00:54:47.580
averaged them, sort of,
and then accounted for there within –
00:54:47.580 --> 00:54:50.671
accounted for the between
model uncertainty to get some
00:54:50.680 --> 00:54:54.029
sort of epistemic uncertainty
in the background model.
00:54:54.900 --> 00:55:00.180
And now, for site-specific hazard,
that is pretty – and that’s all you need,
00:55:00.189 --> 00:55:05.529
and that’s pretty easy, then,
to put that into hazard calculations.
00:55:05.529 --> 00:55:11.349
So what we did is, if we have a site –
let’s say the site in San Jose.
00:55:11.349 --> 00:55:14.780
We just put a grid – I don’t know –
I think it was, like,
00:55:14.780 --> 00:55:19.839
200 by 200 kilometers around that
site and have lots of grid points.
00:55:19.839 --> 00:55:25.890
And for each grid point, we calculate –
we find the geometrical spreading
00:55:25.890 --> 00:55:30.900
coefficient from the VCM.
The VCM has points all over California.
00:55:30.900 --> 00:55:35.019
So we find that particular value
for one particular grid point.
00:55:35.019 --> 00:55:40.440
And then, for those same grid points
around the site, we randomly sample
00:55:40.440 --> 00:55:46.769
100 different constants from the
distribution that was implied by
00:55:46.769 --> 00:55:50.160
the varying coefficient model.
So varying – it gives us – for each
00:55:50.160 --> 00:55:56.029
of those grid points, it gives us the
constant adjustment and an uncertainty.
00:55:56.029 --> 00:56:01.479
So we randomly sample that uncertainty
100 times so that we have some –
00:56:01.480 --> 00:56:07.440
so that we carry that uncertainty
through to the end hazard curves.
00:56:07.440 --> 00:56:13.220
And so that basically means we have,
like, 100 different branches on the
00:56:13.230 --> 00:56:19.279
logic tree, each with a set of coefficients
for all of our grid points.
00:56:19.279 --> 00:56:23.959
And then, for the site location,
we just find the Vs30 scaling coefficient
00:56:23.959 --> 00:56:29.559
from the VCM, apply that there, and we
also have a – like, a site constant from
00:56:29.559 --> 00:56:35.390
the VCM, together with an uncertainty,
and we randomly sample
00:56:35.390 --> 00:56:38.760
100 site [inaudible].
So we randomly sample 100 site
00:56:38.760 --> 00:56:43.000
constants for that particular site
and also put that on the logic tree.
00:56:43.000 --> 00:56:47.529
That means we have, again,
100 different – 100 different
00:56:47.529 --> 00:56:50.560
sample site terms that sample
the uncertainty and the
00:56:50.560 --> 00:56:53.839
site adjustments for
that particular site.
00:56:53.839 --> 00:56:57.450
And then, for all the grid points
around the site, we also – we calculate
00:56:57.450 --> 00:57:04.720
the distance – the cell-specific distances
to the site, calculate – and then calculate
00:57:04.720 --> 00:57:08.980
the anelastic attenuation term together
with the cell-specific constant.
00:57:08.980 --> 00:57:13.060
And then that means, for each grid point
that we have – that we have, let’s say,
00:57:13.060 --> 00:57:19.819
200 by 200, every 5 kilometers –
I don’t know how much that is
00:57:19.819 --> 00:57:23.079
right now, but then, for each of those
grid points, we have an adjustment.
00:57:23.079 --> 00:57:29.420
And that – and then, if we – if we want
to calculate the median prediction for
00:57:29.420 --> 00:57:36.189
a source, we just – for one particular
source, we find the grid point that is
00:57:36.189 --> 00:57:41.549
closest to the source, and then just add
the adjustments that we had previously
00:57:41.549 --> 00:57:45.680
calculated to the
median prediction there.
00:57:45.680 --> 00:57:53.039
So that’s pretty easy for a site-specific.
If you want to repeat that for many
00:57:53.040 --> 00:57:56.940
different sites, that’s going to
be more complicated.
00:57:57.829 --> 00:58:01.700
So that’s basically just – it’s actually
not from the Landwehr paper,
00:58:01.710 --> 00:58:10.019
but this is kind of what this shows.
So what we see here is, we see –
00:58:10.019 --> 00:58:18.219
this is actually the data that we have.
And these are sort of like the constants –
00:58:18.220 --> 00:58:23.100
the constants that we estimate
for each source location.
00:58:23.100 --> 00:58:27.390
So for each location, we estimate, oh,
this is actually the constant that we
00:58:27.390 --> 00:58:30.660
need to apply based on
the model that we have.
00:58:30.660 --> 00:58:36.949
And then we go through this grid
here and calculate the mean of those
00:58:36.949 --> 00:58:41.969
constants at all those grid points,
conditioned on what we observe
00:58:41.969 --> 00:58:45.219
at the – at the data
points that we have.
00:58:45.219 --> 00:58:49.710
And then, what we see is here.
So here we have negative values.
00:58:49.710 --> 00:58:55.009
And that means all the points
close to those negative constants
00:58:55.009 --> 00:58:59.309
that we have from the model
also get a negative adjustment –
00:58:59.309 --> 00:59:01.700
a negative
average adjustment.
00:59:01.700 --> 00:59:08.140
And then same here. We get a slightly
positive or more zero adjustment.
00:59:08.140 --> 00:59:11.839
And then, the farther away we go
from the data, the adjustment terms
00:59:11.839 --> 00:59:16.249
become zero. And on the right,
we see the standard deviation of those.
00:59:16.249 --> 00:59:21.140
And that’s the standard
deviation of those points that
00:59:21.140 --> 00:59:22.749
go together with that mean.
00:59:22.749 --> 00:59:28.089
And, again, it’s small close to the
data and large very far of the data.
00:59:28.089 --> 00:59:31.849
And then what we do is we just –
for each of those grid points,
00:59:31.849 --> 00:59:36.469
we randomly sample from this
distribution implied by this mean
00:59:36.469 --> 00:59:41.509
and the standard deviation 100 times.
And that gives us – if we have a source
00:59:41.509 --> 00:59:47.799
somewhere around here, we just find the
closest point and add that prediction at
00:59:47.800 --> 00:59:52.680
that adjustment to the median
prediction for that particular source.
00:59:54.060 --> 01:00:01.060
These are the sites that we use.
So one is somewhere south of San Jose.
01:00:01.060 --> 01:00:07.960
I think that was mainly chosen
because it’s somewhere in a spot
01:00:07.979 --> 01:00:12.439
where we actually have data
and where the uncertainty is low.
01:00:12.439 --> 01:00:15.819
And this is chosen because
that’s where we don’t have data
01:00:15.819 --> 01:00:18.249
and the uncertainty is large.
01:00:18.249 --> 01:00:25.469
And then this is a site west of
San Luis Obispo pretty close to the –
01:00:25.469 --> 01:00:29.520
close to the water.
And you can – I asked Norm if
01:00:29.520 --> 01:00:33.420
we should be more specific
in the paper, and he said no,
01:00:33.420 --> 01:00:37.329
but you can probably
guess which site that is.
01:00:37.329 --> 01:00:41.849
And this just basically shows the
epistemic uncertainty associated with
01:00:41.849 --> 01:00:46.009
median predictions, again
from the Landwehr paper.
01:00:46.009 --> 01:00:50.329
And then, this is how we end
up with these hazard curves.
01:00:50.329 --> 01:00:57.509
So this is calculated hazard
using NGA-West2 as is.
01:00:57.509 --> 01:01:03.829
And this is calculated hazard using
the nonergodic adjustment terms
01:01:03.829 --> 01:01:07.660
from Landwehr and the
cell-specific models.
01:01:07.660 --> 01:01:12.080
So, on the left for San Jose.
On the right for northeastern California.
01:01:12.080 --> 01:01:17.630
Here, we don’t really have data.
That means, on average, our
01:01:17.630 --> 01:01:21.170
adjustment coefficients will be zero,
but it will have very large uncertainty.
01:01:21.170 --> 01:01:24.640
And that means, if we – if they
have very large uncertainty,
01:01:24.640 --> 01:01:30.499
and we sample from that 100 times,
each different run will be very different.
01:01:30.499 --> 01:01:34.089
But if we average them,
we go back to the mean hazard.
01:01:34.089 --> 01:01:37.630
But the fractiles just
become really, really large.
01:01:37.630 --> 01:01:43.820
On the left, we actually do have data.
That means we actually have the
01:01:43.820 --> 01:01:48.820
adjustment coefficients that are
non-zero and that drive the median
01:01:48.820 --> 01:01:53.960
predictions away from their ergodic
values implied by NGA-West2.
01:01:53.969 --> 01:01:59.969
And that means we actually
change the mean hazard.
01:01:59.969 --> 01:02:03.240
On the other hand,
since we have data, we’re also
01:02:03.240 --> 01:02:05.430
a little bit more certain
about those values.
01:02:05.430 --> 01:02:10.380
That means we decrease the fractiles
compared to the [inaudible].
01:02:10.380 --> 01:02:18.309
So we see some larger differences
at the very low exceedance rates.
01:02:18.309 --> 01:02:22.950
Because there’s some contributions
from far-away sources where we
01:02:22.950 --> 01:02:27.900
actually do have a little bit of
data for northeastern California.
01:02:29.089 --> 01:02:35.419
The last thing I want to mention is
that we can do these things in
01:02:35.419 --> 01:02:39.930
California because there’s
a lot of data in California.
01:02:39.930 --> 01:02:46.040
So the – so systematic effects
are pretty strong in California.
01:02:46.040 --> 01:02:52.919
And the thing is, if we go to some
regions where we don’t have data, that
01:02:52.920 --> 01:02:57.880
doesn’t mean that these non-systematic,
nonergodic effects do not exist.
01:02:57.880 --> 01:03:00.199
It just means that we
cannot estimate them.
01:03:00.199 --> 01:03:03.739
But that doesn’t mean that you can’t
really run nonergodic hazard there.
01:03:03.739 --> 01:03:07.419
It’s just more complicated.
You just – if data is sparse,
01:03:07.419 --> 01:03:09.759
you just have to make
more assumptions.
01:03:09.759 --> 01:03:17.239
So one thing that we just did,
and we need to get back to that, is –
01:03:17.240 --> 01:03:20.160
so that’s an example
from Georgia.
01:03:20.160 --> 01:03:27.619
And so I think we had maybe, like,
50 records from Georgia.
01:03:27.619 --> 01:03:31.519
And then we tried to calculate
nonergodic adjustment terms.
01:03:31.519 --> 01:03:35.709
So this is – on the left is nonergodic
adjustment terms relative to
01:03:35.709 --> 01:03:41.300
this particular site here.
And that is, I think, where they have a
01:03:41.300 --> 01:03:46.440
dam. They don’t really have data there,
and they don’t have stations there.
01:03:46.440 --> 01:03:52.560
But this dam is really, really important
for the electricity supply in Georgia.
01:03:52.560 --> 01:03:56.920
So, if something happened to that dam,
it would be pretty bad, and they
01:03:56.920 --> 01:04:02.840
would have to buy electricity
from somewhere else.
01:04:02.840 --> 01:04:06.840
So probably from
neighboring countries.
01:04:06.849 --> 01:04:11.420
So they’re – my understanding is,
they’re a little bit concerned,
01:04:11.420 --> 01:04:15.799
and they wanted to, like, look at that.
And then Norm wanted to go and say,
01:04:15.799 --> 01:04:20.519
look, you actually can do this.
So we – so they have some pocket of
01:04:20.519 --> 01:04:25.569
data here and some data down there.
And we actually calculated
01:04:25.569 --> 01:04:30.580
nonergodic adjustment terms
relative to that particular site.
01:04:30.580 --> 01:04:35.060
And we also calculated nonergodic
adjustment terms relative to a site
01:04:35.061 --> 01:04:39.469
that sits somewhere in that
pocket where we have data.
01:04:39.469 --> 01:04:42.360
So I don’t know if you
can read those numbers.
01:04:44.029 --> 01:04:47.959
The contours sort of make sense.
So here we are – that’s where
01:04:47.960 --> 01:04:51.900
we don’t really know.
The adjustments are pretty constant.
01:04:51.900 --> 01:04:55.880
Whereas, they get a little bit stronger
towards the points where we have
01:04:55.880 --> 01:04:59.789
data because there we have data.
That means we might – but, since we
01:04:59.789 --> 01:05:07.179
only have these, like, I don’t know,
40 data points, the average adjustments
01:05:07.179 --> 01:05:09.750
are extremely small.
01:05:09.750 --> 01:05:13.039
The same thing here.
That shows – oh, yeah.
01:05:13.039 --> 01:05:15.079
That actually shows
the median adjustment.
01:05:15.079 --> 01:05:18.919
So I didn’t put in the variances.
We barely see a dent.
01:05:18.919 --> 01:05:23.989
So you – the variances get a little bit
smaller where you have data,
01:05:23.989 --> 01:05:28.569
but nothing like in California where
we go from 0.5 to 0.2 or something.
01:05:28.569 --> 01:05:33.049
Here, we go from 0.55 to 0.54
or something like that.
01:05:33.049 --> 01:05:39.089
So the data that we have is not really
enough to actually change this.
01:05:39.089 --> 01:05:42.099
But we can still do this.
01:05:42.099 --> 01:05:47.589
So we just have to make some
sort of assumptions how these
01:05:47.589 --> 01:05:53.010
nonergodic effects will be –
will be distributed.
01:05:53.010 --> 01:05:57.650
So what we did here is we just
assumed that the covariant structure
01:05:57.650 --> 01:06:02.960
that we have for California
is transferable to Georgia.
01:06:02.960 --> 01:06:06.459
That means we take the same length
scales and the same variances of the
01:06:06.459 --> 01:06:10.969
nonergodic effects and
apply them in Georgia.
01:06:10.969 --> 01:06:14.540
Whether that is correct or not,
that’s probably …
01:06:16.540 --> 01:06:19.680
It’s probably not totally correct,
but it’s at least a start.
01:06:19.690 --> 01:06:24.229
And then basically, if you just do that,
your hazard will probably look like this.
01:06:24.229 --> 01:06:27.519
So that means the nonergodic hazard
will not be different from the ergodic
01:06:27.519 --> 01:06:31.140
hazard because we don’t know
what the adjustments will be.
01:06:31.140 --> 01:06:34.319
We just have really, really
wide uncertainties.
01:06:34.319 --> 01:06:37.799
These wide uncertainties will
be a lot more realistic than
01:06:37.800 --> 01:06:41.000
uncertainties from
an ergodic model.
01:06:44.800 --> 01:06:54.219
Things to be careful about or things to –
let’s say issues or things to look into
01:06:54.219 --> 01:07:01.579
going forward is – there are
some issues or – yeah, there are
01:07:01.579 --> 01:07:04.329
some issues with the
varying coefficient model.
01:07:04.329 --> 01:07:09.130
And some of them are just
computational, in that is, if you
01:07:09.130 --> 01:07:16.410
remember, like, the full VCM models
everything in the covariance matrix.
01:07:16.410 --> 01:07:24.200
That means we have a covariance matrix
that has non-zero entries everywhere.
01:07:24.200 --> 01:07:27.589
And that means, if we have
12,000 records, we have a covariance
01:07:27.589 --> 01:07:34.309
matrix of 12,000 by 12,000.
That will not – that will crash your
01:07:34.309 --> 01:07:40.650
computer if you try to invert that matrix.
So that’s not something you can just do.
01:07:40.650 --> 01:07:44.429
So the question – so what we have to
do is we have to make some
01:07:44.429 --> 01:07:48.809
approximations, which means,
how good are these approximations
01:07:48.809 --> 01:07:55.920
really to capture – to capture all the
stuff that we want to capture,
01:07:55.920 --> 01:08:03.289
especially if data is maybe not –
is maybe not regularly spaced.
01:08:03.289 --> 01:08:07.019
Often when people do, oh,
I’ve got this new approximation.
01:08:07.019 --> 01:08:11.579
It works really well on this
simulated data set, which is –
01:08:11.579 --> 01:08:15.160
has points everywhere.
That’s not real data, so that’s
01:08:15.160 --> 01:08:20.880
not really how we have. We have
classes of data here, classes of data there.
01:08:20.880 --> 01:08:23.839
So the question is, how good
are these approximations,
01:08:23.839 --> 01:08:26.259
and can they really capture
what we want to have?
01:08:26.259 --> 01:08:28.969
There are tradeoffs
between some parameters.
01:08:28.969 --> 01:08:32.711
So you can often make the variance
larger and maybe the length scale
01:08:32.711 --> 01:08:37.529
a little bit smaller and get
very similar predictions.
01:08:37.529 --> 01:08:45.230
So that is something that
probably needs some sort of input.
01:08:45.230 --> 01:08:50.779
And then there’s a question –
some parameters, I don’t really –
01:08:50.779 --> 01:08:54.420
are not really well-identifiable,
so that means you can just
01:08:54.420 --> 01:08:58.450
swap them out and
get the same results.
01:08:58.450 --> 01:09:05.360
So that’s just computational stuff, and
that’s some – we’re looking into that.
01:09:05.360 --> 01:09:10.480
There is also modeling stuff.
It is kind of – how should we really
01:09:10.489 --> 01:09:13.549
model the covariance function?
Like, right now, we just use the
01:09:13.549 --> 01:09:17.520
exponential covariance function.
You can do some other – you can
01:09:17.520 --> 01:09:22.730
do a squared exponential.
You can do something other.
01:09:22.730 --> 01:09:26.730
You can do [inaudible]
or whatever.
01:09:26.730 --> 01:09:31.340
So does that have an input –
an influence on the results?
01:09:31.340 --> 01:09:34.680
[loud static]
01:09:34.680 --> 01:09:35.900
[static stops]
01:09:35.900 --> 01:09:39.660
I’m sorry.
[laughter]
01:09:39.660 --> 01:09:45.160
Okay. The other question is,
so right now – so we moved everything
01:09:45.170 --> 01:09:48.349
to the covariance function.
But should we maybe move
01:09:48.349 --> 01:09:52.190
some stuff back to the mean.
So if we have a mean function,
01:09:52.190 --> 01:09:58.400
that might be a little bit easier to
put some constraints on some of
01:09:58.400 --> 01:10:01.980
the scaling of the model.
Because it’s harder – it’s harder
01:10:01.980 --> 01:10:05.980
to understand the covariance
function a little bit and what
01:10:05.980 --> 01:10:08.530
that implies for
the covariance.
01:10:08.530 --> 01:10:14.030
And then the other – the big thing is
that people always come back is,
01:10:14.030 --> 01:10:17.230
is it physical? That’s a very
empirical model right now.
01:10:17.230 --> 01:10:22.840
So we just – the data – we showed –
I mean, we checked that we
01:10:22.840 --> 01:10:28.889
don’t do, like, crazy things.
But it’s harder to check the model.
01:10:28.889 --> 01:10:36.179
Because you cannot just look at scaling
at all possible locations in California.
01:10:36.179 --> 01:10:39.199
So that’s just not
humanly possible.
01:10:39.200 --> 01:10:41.380
So …
01:10:43.280 --> 01:10:47.559
And it’s a very empirical model.
I mean, we do check that it predicts
01:10:47.559 --> 01:10:51.270
better than the corresponding
ergodic model.
01:10:51.270 --> 01:10:54.280
And we looked at the residuals
at Napa, and they looked good,
01:10:54.280 --> 01:10:58.400
even though Napa was
not in the data set.
01:10:58.400 --> 01:11:02.630
So they looked better than
the typical NGA model.
01:11:02.630 --> 01:11:08.440
But how can we incorporate physical
constraints in those adjustments?
01:11:08.440 --> 01:11:16.130
That’s a very open question, and I –
yeah, that’s – we don’t really know.
01:11:16.130 --> 01:11:22.280
And then the other question is – I mean,
I kind of touched – is, is it really –
01:11:22.280 --> 01:11:25.820
so right now, we just have the same
length scale for northern California
01:11:25.820 --> 01:11:28.699
and southern California.
Is that really true, or should it
01:11:28.699 --> 01:11:33.400
be something like non-stationary?
That means that we can have the –
01:11:33.400 --> 01:11:36.531
that the covariance function
can actually change – or, the length scale
01:11:36.531 --> 01:11:41.219
can somehow change
depending on where we are.
01:11:41.220 --> 01:11:48.880
That just is possible.
It just makes things very, very –
01:11:48.880 --> 01:11:52.960
it just makes the modeling more
complex and then also the estimation
01:11:52.960 --> 01:11:55.280
of the coefficients
more complex.
01:11:55.280 --> 01:12:01.949
And then, in the end, is how to make
the varying coefficient model and
01:12:01.949 --> 01:12:09.840
how to make it easier applicable
to seismic hazard instead of going
01:12:09.840 --> 01:12:15.020
actually through a logic tree of 100
or 200 sampled realizations.
01:12:16.000 --> 01:12:20.699
So basically, just as a summary,
ergodic seismic hazard
01:12:20.700 --> 01:12:22.960
ignores systematic effects.
01:12:22.960 --> 01:12:28.980
It aggregates data, which is good
because it’s statistically very nice,
01:12:28.980 --> 01:12:35.080
and you can do nice models with it, but
it ignores the systematic source effects.
01:12:35.080 --> 01:12:39.160
And ignoring that will get actually
just biased hazard curves.
01:12:39.170 --> 01:12:43.330
So those hazard curves will be,
on average, correct over California,
01:12:43.330 --> 01:12:47.239
but they will not be correct
at any individual site.
01:12:47.239 --> 01:12:52.699
So – and also, because we aggregate
data, we get a really good handle on
01:12:52.700 --> 01:12:57.080
our coefficients, which means we
get two small uncertainty estimates.
01:12:58.280 --> 01:13:02.460
The variant coefficient model and the
Gaussian process is a very nice model
01:13:02.460 --> 01:13:08.060
to obtain those estimates and the
uncertainty because it automatically –
01:13:08.060 --> 01:13:12.620
it only leads to deviations where
we have data, and it leads to zero
01:13:12.630 --> 01:13:14.960
deviations where
we don’t have data.
01:13:14.960 --> 01:13:19.160
But, at the same time, increases
the uncertainty on [inaudible].
01:13:19.160 --> 01:13:27.100
Site-specific PSHA is
conceptually straightforward.
01:13:27.110 --> 01:13:32.230
We just calculate adjustments for a
bunch of – adjustments for a bunch of
01:13:32.230 --> 01:13:38.130
grid points around the site and then just
look wherever we are in terms of source
01:13:38.130 --> 01:13:44.340
and add that to the median predictions.
And if you – if we do that, we get
01:13:44.340 --> 01:13:47.340
hazard curves that deviate strongly
01:13:47.340 --> 01:13:52.290
from ergodic ones and – if there is data.
But there still are lots of things to
01:13:52.290 --> 01:13:59.270
do and lots of nonergodic models
to estimate for – both for
01:13:59.270 --> 01:14:03.840
California and for lots
of other regions as well.
01:14:03.840 --> 01:14:06.440
And that’s it.
Thanks.
01:14:06.440 --> 01:14:11.040
[Applause]
01:14:11.060 --> 01:14:16.420
- I think we have time for a quick
question or two if anyone has one.
01:14:19.600 --> 01:14:21.760
- Hey, Nico. Thanks.
That was really nice.
01:14:21.760 --> 01:14:25.990
So I think you just sort of started talking
about it a little bit at the end, but maybe
01:14:25.990 --> 01:14:30.719
not right now, but in the future, when
these models become common practice,
01:14:30.719 --> 01:14:36.679
how do you envision people like
me trying to plot them or people
01:14:36.679 --> 01:14:42.090
using them for other applications?
Will you – like, you were talking about,
01:14:42.090 --> 01:14:46.030
should you have a median model that
accompanies them, or – and maybe
01:14:46.030 --> 01:14:50.929
that gets in a little bit to the PSHA.
Right, so you were saying, it’s relatively
01:14:50.929 --> 01:14:54.139
straightforward if you do all these steps.
- When – yes.
01:14:54.139 --> 01:14:58.030
- But if you’re – you know, you want to
just know, what is your ground motion
01:14:58.030 --> 01:15:02.619
at a location for a magnitude 6
at 20 kilometers, how can I –
01:15:02.620 --> 01:15:05.160
or, how do you envision
that happening?
01:15:05.160 --> 01:15:09.320
- I actually envision that happening
by being more thorough with
01:15:09.320 --> 01:15:13.889
documentation and
making code available.
01:15:13.889 --> 01:15:16.679
And not just saying,
look, this is the model.
01:15:16.679 --> 01:15:21.820
This is a – this is a paper
written by a computer scientist.
01:15:21.820 --> 01:15:28.580
So that just needs to happen.
So I think that’s – I think –
01:15:28.580 --> 01:15:32.940
like, we have weekly –
or, biweekly calls about that.
01:15:32.940 --> 01:15:37.820
And Norm is always –
explain this in plain English.
01:15:37.820 --> 01:15:45.070
It’s just hard for me, but I’m not
a native speaker, so – but, no, it’s –
01:15:45.070 --> 01:15:52.260
I mean, it’s just – it’s [inaudible].
So I think, yeah, that will – to make
01:15:52.260 --> 01:15:57.440
the model applicable, there needs to
be a lot more documentation
01:15:57.440 --> 01:16:03.670
and code to be made available.
Which I think will happen.
01:16:03.670 --> 01:16:12.579
So that – and it needs to be a little bit
maybe better explained that, if you –
01:16:12.579 --> 01:16:19.780
that people can still make their
own calculations if you just know
01:16:19.780 --> 01:16:21.960
the variances and
length scale parameters.
01:16:21.960 --> 01:16:23.940
- Yeah. I mean, it’s not so different,
I guess, right now.
01:16:23.949 --> 01:16:25.250
- It’s not.
- You still need to make
01:16:25.250 --> 01:16:26.650
some assumptions …
- Right.
01:16:26.650 --> 01:16:28.959
- On your generic fault
parameters or something.
01:16:28.959 --> 01:16:31.219
- Yes. Right.
- But, yeah, it’s not –
01:16:31.219 --> 01:16:34.459
currently it’s not clearly explained.
- No, I agree, so …
01:16:34.460 --> 01:16:37.580
- And this will take even more …
- Right.
01:16:37.580 --> 01:16:43.770
So I think we’ll – like, the next model,
we’ll try to explain that a little bit better.
01:16:43.770 --> 01:16:48.420
And also, I think, make code available.
- Oh, yeah. Or not opensource.
01:16:48.420 --> 01:16:50.239
A closed code that …
- Yes. [inaudible]
01:16:50.239 --> 01:16:52.960
- … you can’t mess with, but you
just input your location
01:16:52.960 --> 01:16:56.400
and what you want to know.
- Yeah. Maybe something like that.
01:16:56.400 --> 01:16:57.400
- Great.
01:17:02.560 --> 01:17:09.340
- I have a quick question about the
cell-specific attenuation model.
01:17:09.340 --> 01:17:13.500
How do you – do you treat spatial
correlation when you develop that?
01:17:13.500 --> 01:17:19.560
Or just let the data kind of determine …
- No. I just let the data determine it.
01:17:22.650 --> 01:17:28.120
So you can see that a little bit.
If you look at the posterior distributions
01:17:28.139 --> 01:17:31.570
and then plot them for one cell that’s,
like, neighboring, you can
01:17:31.570 --> 01:17:35.409
see some sort of correlation.
Not everywhere.
01:17:35.409 --> 01:17:43.110
So I try to do that, and so I tried
to look – I estimated – it’s called
01:17:43.110 --> 01:17:45.940
a conditional
auto-regressive model.
01:17:45.940 --> 01:17:51.980
So basically, there, you just assume
that neighboring cells are correlated.
01:17:51.980 --> 01:17:54.800
Only neighboring cells.
So that makes the matrix that
01:17:54.800 --> 01:17:59.780
[inaudible], and it makes it
somewhat efficient.
01:17:59.780 --> 01:18:03.680
And things change.
Things change a little bit.
01:18:03.690 --> 01:18:08.180
They mainly change for cells where
you don’t have a lot of data,
01:18:08.180 --> 01:18:12.260
but which are close to cells
where you do have a lot of data.
01:18:12.260 --> 01:18:17.280
Now, on the one hand, you can say,
sure, that should be the case.
01:18:17.280 --> 01:18:20.679
And, I mean, that’s the assumption
that’s made in the varying coefficient
01:18:20.679 --> 01:18:26.059
model, right, that things that are close
together – on the other hand, attenuation
01:18:26.059 --> 01:18:29.539
can actually change pretty quickly.
Like, if you go from – especially if you
01:18:29.539 --> 01:18:31.940
have a fault, you can actually have
a very different attenuation
01:18:31.940 --> 01:18:38.610
between two different faults.
So I don’t think the data is enough to
01:18:38.610 --> 01:18:42.889
actually – I mean, you can do that, and
then you can look at some, like,
01:18:42.889 --> 01:18:47.539
cross-validation and say which one
predicts the data better than the other.
01:18:47.539 --> 01:18:53.309
I don’t think it’s enough to actually
distinguish between those two models.
01:18:53.309 --> 01:19:03.200
So I don’t know how best to do this.
So I would probably prefer not to do it.
01:19:03.200 --> 01:19:08.840
When it is so sort of implied, then in
the model, it comes out of the model.
01:19:08.840 --> 01:19:15.550
But basically it’s – the only thing is,
if there’s no paths through a cell,
01:19:15.550 --> 01:19:19.199
but there’s a lot of paths in the
neighboring cells, do you assume
01:19:19.199 --> 01:19:25.570
it’s the same correlation? And that’s
probably true for some parts and
01:19:25.570 --> 01:19:30.139
not true for other parts.
So – I mean, that’s always – ideally,
01:19:30.140 --> 01:19:34.280
you would have something that
takes care – but we need a lot more
01:19:34.280 --> 01:19:38.500
data to actually distinguish
between those things.
01:19:39.520 --> 01:19:42.520
- Yep. Okay. Thank you.
01:19:43.800 --> 01:19:45.280
- We lost everyone.
01:19:45.280 --> 01:19:48.760
- Yeah, I think everyone
had meetings to get to.