WEBVTT Kind: captions Language: en-US 00:00:02.120 --> 00:00:04.420 Hello. Good morning, everybody. 00:00:04.420 --> 00:00:06.880 Thank you all for coming to the seminar today. 00:00:06.880 --> 00:00:10.519 And before I hand off the microphone to Sarah, who will be introducing 00:00:10.520 --> 00:00:14.720 our speaker, I do have a few announcements. 00:00:14.720 --> 00:00:20.200 So – let’s see – so tomorrow at 9:30, we’ll have an all-hands meeting here. 00:00:20.200 --> 00:00:25.259 And then, on Thursday, we have a group photo and the ice cream social. 00:00:25.259 --> 00:00:30.410 So the group photo will be at noon out on the steps in front of Building 3A. 00:00:30.410 --> 00:00:34.930 And then the ice cream social will be between 3A and 15 at 1:30. 00:00:34.930 --> 00:00:37.210 So please show up to that. 00:00:37.210 --> 00:00:41.440 And then our next seminar will be on Wednesday, July 3rd, 00:00:41.440 --> 00:00:46.700 at the usual time and place. Okay, so here is Sarah. 00:00:48.480 --> 00:00:50.960 Good morning, everyone. I am very pleased to introduce 00:00:50.969 --> 00:00:53.399 this morning’s speaker, Stephen Wu. 00:00:53.399 --> 00:00:55.940 Stephen is one of my favorite collaborators. 00:00:55.940 --> 00:00:59.880 He got his BSE in civil and environmental engineering and B.S. 00:00:59.880 --> 00:01:02.840 in mathematics from the University of Michigan and a master’s and 00:01:02.840 --> 00:01:06.480 Ph.D. from Caltech in mechanical and civil engineering with a minor 00:01:06.480 --> 00:01:10.570 in geophysics, where he worked with Jim Beck on a thesis entitled 00:01:10.570 --> 00:01:13.430 The Future of Earthquake Early Warning – Quantifying Uncertainty 00:01:13.430 --> 00:01:16.720 and Making Fast Automated Decisions for Applications. 00:01:16.720 --> 00:01:20.120 He then was a postdoc at ETH working on uncertainty quantification 00:01:20.120 --> 00:01:23.000 and multi-scale simulations. And he’s currently an assistant 00:01:23.000 --> 00:01:26.580 professor at the Institute of Statistical Mathematics in Japan, where he works 00:01:26.580 --> 00:01:29.940 on bioinformatics, material informatics, and seismology. 00:01:29.940 --> 00:01:33.190 His research interests include machine learning, data simulation, 00:01:33.190 --> 00:01:36.790 molecular dynamics, formal kinetics, automated decision-making, 00:01:36.790 --> 00:01:39.420 complex network reliability, compression sensing, 00:01:39.420 --> 00:01:42.840 and optimal sensor placement strategy. Did I leave anything out, Stephen? 00:01:42.840 --> 00:01:46.240 - [inaudible] [laughter] - Okay. Awesome. 00:01:46.240 --> 00:01:48.380 Welcome, Stephen. 00:01:52.660 --> 00:01:56.020 - Oops. [laughs] Thank you very much. It’s great to be here. 00:01:56.020 --> 00:01:59.340 Thanks for the opportunity, and thanks a lot, Sarah, for the 00:01:59.340 --> 00:02:04.640 invitation and great introduction [laughs] to my background. 00:02:04.640 --> 00:02:09.670 Pretty much, I work on anything I feel interested in. 00:02:09.670 --> 00:02:15.760 So I’m not a very devoted researcher in a particular field. 00:02:15.760 --> 00:02:20.900 But that actually makes it interesting to have a chance to collaborate 00:02:20.900 --> 00:02:25.940 with many kind of different people. So I enjoyed it very much. 00:02:25.940 --> 00:02:31.260 Today I will talk about the so-called IPF methods. 
00:02:32.480 --> 00:02:39.040 This is interesting, not just in terms of the fact that all the methodology is 00:02:39.049 --> 00:02:45.489 a little bit different than the things that has been doing in Japan in JMA, 00:02:45.489 --> 00:02:49.719 but if people know about JMA, know about Japan, you know that 00:02:49.720 --> 00:02:53.700 JMA usually do things quite independently. 00:02:53.700 --> 00:02:59.120 No matter how much research has been going on outside, it’s quite hard 00:02:59.120 --> 00:03:04.420 to sort of get things done together. And this is probably one of the rare 00:03:04.420 --> 00:03:12.170 cases where – as you can see, the DPRI-Kyoto University is part of it. 00:03:12.170 --> 00:03:17.599 And so it’s kind of a rare case where things actually get together. 00:03:17.600 --> 00:03:22.340 So I hope I didn’t say anything politically incorrect. [laughter] 00:03:22.340 --> 00:03:24.700 But I think you know what I mean. [laughs] 00:03:24.700 --> 00:03:30.239 So – and of course, methodology-wise, this is interesting to me too. 00:03:30.239 --> 00:03:37.409 Because it’s sort of doing one step forward than deterministic prediction. 00:03:37.409 --> 00:03:40.919 We talk about uncertainty here. So let me show you first – 00:03:40.919 --> 00:03:45.049 what I plan to do is to show a little bit of the background 00:03:45.049 --> 00:03:49.620 of the motivation of doing this. And then I will introduce the methods. 00:03:49.620 --> 00:03:53.769 And I think people here will be more interested in the actual implementation. 00:03:53.769 --> 00:03:58.780 Because the actual theoretical implementation versus the actual 00:03:58.780 --> 00:04:02.279 implementation, there’s some gap there. So there’s some [inaudible] things 00:04:02.279 --> 00:04:07.559 probably going around behind the screen. I will introduce that too. 00:04:07.560 --> 00:04:12.060 And then a bunch of case study, which I think is quite interesting. 00:04:13.120 --> 00:04:17.660 So let me officially start it with the motivation. 00:04:20.300 --> 00:04:26.819 I think, to me, this is basically the way I look at my – when I set up 00:04:26.819 --> 00:04:30.690 research problem myself. I mean, this is just a very general thing 00:04:30.690 --> 00:04:33.919 about doing scientific research, right? You have always starting 00:04:33.919 --> 00:04:37.280 from a problem. You make a – you make a – 00:04:37.280 --> 00:04:41.240 you frame the problem in some way where – so that you can collect data 00:04:41.240 --> 00:04:45.190 to help you to do some modeling. And then, from there, you probably 00:04:45.190 --> 00:04:48.360 do decision-making, right? This is very general. 00:04:48.360 --> 00:04:52.520 And, in the process, probably very likely, you will try to get some update 00:04:52.520 --> 00:04:58.310 and feedback loop where it will help you to improve your model. 00:04:58.310 --> 00:05:02.840 And particularly in this updating process and the decision-making 00:05:02.840 --> 00:05:07.349 process, it’s very likely there’s many things that you don’t know exactly. 00:05:07.349 --> 00:05:11.050 So there’s uncertainty comes in. The question is, can you utilize 00:05:11.050 --> 00:05:16.050 also the uncertainty to make the process even more efficient? 00:05:16.050 --> 00:05:18.780 And this is what I’m interested in. 00:05:18.780 --> 00:05:23.400 And I think we can always boil down every problem into 00:05:23.400 --> 00:05:26.770 a simple regression problem that looks like this, right? 
00:05:26.770 --> 00:05:30.880 This is the whole science and the whole human history. 00:05:32.440 --> 00:05:36.639 The magic behind this is that, pretty much, you can model everything 00:05:36.639 --> 00:05:42.289 with a mean process with things that you know, and you think this is a trend. 00:05:42.289 --> 00:05:45.440 And so basically, you are building a model to feed the trend. 00:05:45.440 --> 00:05:49.220 Or you try to understand the trend. Either physically, using a physical 00:05:49.220 --> 00:05:53.400 model, or now everyone knows about the term “machine learning,” right? 00:05:53.400 --> 00:05:58.500 So you blindly cover your eyes and just let the computer build some 00:05:58.500 --> 00:06:03.409 model based on data – based on some statistical methods. 00:06:03.409 --> 00:06:06.099 And so, from there, you get a prediction. You get a mean prediction. 00:06:06.099 --> 00:06:08.860 And in earthquake early warning system, of course, 00:06:08.860 --> 00:06:15.100 this is basically the ground prediction model that we're often using – 00:06:15.100 --> 00:06:18.159 or, later on, there’s more development from there. 00:06:18.159 --> 00:06:21.180 So different model are used. But that’s basically it. 00:06:21.180 --> 00:06:27.460 Now, the question is, this uncertainty part – what I call – or many people 00:06:27.460 --> 00:06:31.889 call this error, noise, residual – whatever you call it. 00:06:31.889 --> 00:06:35.840 What it’s representing is things that you don’t know, right? 00:06:35.840 --> 00:06:39.970 I think there’s a quote in – quite famous in U.S. these days about things you 00:06:39.970 --> 00:06:43.690 know, you know, or you – whatever. [laughter] 00:06:43.690 --> 00:06:46.699 But this is about it. 00:06:46.700 --> 00:06:52.760 So can you really – we are very good at putting model and creating model 00:06:52.760 --> 00:06:57.940 for the trend, for predicting the expected value – the mean value. 00:06:57.940 --> 00:07:04.509 But I think we’re less used to create a model for the – for these kind of 00:07:04.509 --> 00:07:09.270 things we don’t know. Or basically the uncertainty, that I call. 00:07:09.270 --> 00:07:14.730 So here I think what I’m trying to show is that, by having a model there, 00:07:14.730 --> 00:07:18.840 sometimes it does give you some benefit. 00:07:20.000 --> 00:07:24.340 So earthquake early warning – this is the – probably the 00:07:24.340 --> 00:07:27.560 most fundamental one. 00:07:27.560 --> 00:07:33.620 We are trying to predict the shaking at a location with some, let’s say, 00:07:33.620 --> 00:07:37.669 ground motion prediction model. And that would be a function of 00:07:37.669 --> 00:07:42.370 the earthquake source parameter, magnitude, epicenter location, 00:07:42.370 --> 00:07:44.560 origin time. And of course, the location 00:07:44.560 --> 00:07:50.120 of where you want to predict the intensity – the shaking intensity. 00:07:50.120 --> 00:07:54.340 Now, on top of that, again, there’s this error term. 00:07:54.340 --> 00:07:56.800 And the input data, of course, would be the seismic data you 00:07:56.800 --> 00:08:00.500 get from a particular location, right? 00:08:00.500 --> 00:08:08.060 And now, this – I mean, I don’t have to go too much just into details of the early 00:08:08.060 --> 00:08:13.220 warning system here. I think people are familiar with that, for sure. 00:08:13.220 --> 00:08:18.039 So let me jump all the way to look at the Japan case. 
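In symbols, the framing being described is roughly the following (a sketch; the notation is chosen here for illustration and is not taken from the slides):

```latex
% Generic framing: observation = modeled trend + unmodeled uncertainty.
% Early-warning instance: predicted shaking intensity at a site x_j from a
% ground-motion prediction model of the source parameters (magnitude M,
% epicenter x_epi, origin time t_0), plus an error term.
\begin{align}
  y &= f(x) + \varepsilon, \\
  \hat{I}_j &= \mathrm{GMPE}\!\left(M,\, \mathbf{x}_{\mathrm{epi}},\, t_0,\, \mathbf{x}_j\right) + \varepsilon_j .
\end{align}
```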
00:08:18.039 --> 00:08:21.659 Now, early warning system has been implemented in Japan 00:08:21.659 --> 00:08:26.189 for quite a long time. And so it’s interesting that they 00:08:26.189 --> 00:08:31.199 do have quite a lot of statistics. I’m not sure how openly people 00:08:31.199 --> 00:08:36.260 know about them because a lot of them are in Japanese only. 00:08:36.260 --> 00:08:41.000 But if I put – try to put out different documents – in fact, it’s a colleague 00:08:41.000 --> 00:08:44.720 of mine in Japan who’s doing all this hard work. 00:08:44.720 --> 00:08:49.070 Here is some of the statistics that we can summarize. 00:08:49.070 --> 00:08:56.660 Well, we know why, for sure [laughs] – 2011, suddenly there’s a lot of – 00:08:56.660 --> 00:09:01.850 a huge number of warning. And of course, we can expect 00:09:01.850 --> 00:09:06.760 why this is happening. Like, a huge error here. 00:09:06.760 --> 00:09:12.440 The accuracy drops significantly in 2011 because of famous earthquake. 00:09:12.440 --> 00:09:20.540 So people are very alert of this 2011 issue. 00:09:20.540 --> 00:09:25.360 Now, we break it down, and we look at what’s actually happening. 00:09:25.360 --> 00:09:27.680 We try to look at what’s actually happening. 00:09:27.680 --> 00:09:33.320 Before this Tohoku earthquake in 2011, let’s say we have around 00:09:33.320 --> 00:09:39.110 20-something earthquake. The one that has prediction 00:09:39.110 --> 00:09:44.250 of the error – well, this error we define based on the Japanese intensity 00:09:44.250 --> 00:09:48.950 measure. Whenever it’s greater than 2, we define it as a false alarm. 00:09:48.950 --> 00:09:51.790 So basically, all this red part is the false alarm. 00:09:51.790 --> 00:09:56.000 So you see around 20-something warning – 00:09:56.000 --> 00:10:00.910 the JMA early warning system was only having, like, five error. 00:10:01.880 --> 00:10:06.920 Now, after Tohoku, we have – like, let’s say we look at around 70 warnings. 00:10:06.930 --> 00:10:10.370 We have 44 of them wrong. 00:10:10.370 --> 00:10:14.970 And if we want to understand why, we look at all these 44 cases. 00:10:14.970 --> 00:10:19.600 We realize that, around 32 of them are basically due to the fact that they’re 00:10:19.600 --> 00:10:23.800 simultaneously all very close in time. There’s multiple event happening 00:10:23.810 --> 00:10:29.650 around. So this is basically the whole motivation of our study, to develop 00:10:29.650 --> 00:10:34.450 something that is based on uncertainty so that we can eventually capture 00:10:34.450 --> 00:10:37.800 the cases where there’s multiple events. 00:10:38.810 --> 00:10:43.800 Because the previous system of JMA early warning system has no ability 00:10:43.800 --> 00:10:48.580 to tackle what we call the multi-event cases. 00:10:49.740 --> 00:10:55.920 Specifically, what happened is that this is just one particular case in 2011, 00:10:55.920 --> 00:11:01.840 March 15. There was two very small events happening in Japan. 00:11:01.840 --> 00:11:06.540 And what the sensor gets is something like that. 00:11:06.540 --> 00:11:12.190 Around the epicenter, you can see some triggering of the stations. 00:11:12.190 --> 00:11:15.389 The red one is the trigger stations. 00:11:15.389 --> 00:11:20.570 But what the early warning system is looking is, it feels like it’s 00:11:20.570 --> 00:11:27.920 a huge one up there, and it’s just that the wave has propagated 00:11:27.920 --> 00:11:31.980 and gets some significant trigger here as well. 
00:11:31.980 --> 00:11:36.430 So this is how a very typical multi-event case becomes 00:11:36.430 --> 00:11:39.160 a false warning in early warning system. 00:11:39.940 --> 00:11:43.860 Well, for us, we look at it, we know for sure this is such an easy decision 00:11:43.860 --> 00:11:47.880 that it’s not one big earthquake. This is two small ones. 00:11:47.880 --> 00:11:52.389 But how can a simple early warning system, when you have limited time, 00:11:52.389 --> 00:11:56.399 make such a decision? Can we make a simple logic 00:11:56.399 --> 00:12:01.389 to allow the system to make very fast decision so that it can 00:12:01.389 --> 00:12:06.260 robustly identify multiple events from single events? 00:12:08.010 --> 00:12:12.860 The approach I will use is this so-called Bayesian approach, but at the end of 00:12:12.860 --> 00:12:18.260 the day, because the calculation of the Bayesian inversion is usually super 00:12:18.260 --> 00:12:22.500 computationally intensive – so, at the end, we do a lot of approximation. 00:12:22.500 --> 00:12:27.190 And, in fact, our paper published, and people actually look at it 00:12:27.190 --> 00:12:30.010 and say, well, this is not quite Bayesian. [laughs] 00:12:30.010 --> 00:12:36.279 Well, I would say yes. Because, after all the approximations, 00:12:36.279 --> 00:12:41.430 it looks quite similar to the other non-Bayesian method as well. 00:12:41.430 --> 00:12:44.959 But I will – I will just say here, it’s sort of a Bayesian source inversion 00:12:44.960 --> 00:12:50.320 because we do take this equation as a starting point. 00:12:51.280 --> 00:12:55.420 And let me explain what I’m meaning here. 00:12:55.420 --> 00:12:58.140 These are the data. You can imagine all the seismic data here. 00:12:58.149 --> 00:13:02.540 And so we are – I’m talking about the posterior distribution of the probability 00:13:02.540 --> 00:13:09.920 of all the theta – theta is the earthquake – or, the source parameters. 00:13:09.920 --> 00:13:13.709 So the probability of all the source parameter, given data that you have 00:13:13.709 --> 00:13:18.389 been observing so far, would, by Bayes’ theorem, be just equal to the 00:13:18.389 --> 00:13:24.670 probability of observing such a data under a specific given source parameter, 00:13:24.670 --> 00:13:30.640 times the probability of the source parameter before you see any data – 00:13:30.640 --> 00:13:32.500 what we call the prior. 00:13:32.500 --> 00:13:37.780 And there’s this denominator as well. But the denominator is actually 00:13:37.790 --> 00:13:42.830 not so important because, as you can see, we integrate out the theta, 00:13:42.830 --> 00:13:47.170 means that this posterior distribution is proportional to the top. 00:13:47.170 --> 00:13:50.240 So the bottom is just a normalizing constant. 00:13:50.240 --> 00:13:55.360 But, in Bayesian statistics, this normalizing constant is quite important 00:13:55.360 --> 00:14:00.009 when you have multiple models. Because this is basically the term that 00:14:00.009 --> 00:14:05.900 helps you to decide which model is more likely in a statistical sense. 00:14:08.080 --> 00:14:12.959 So briefly speaking, this is just a – when you integrate out the theta, 00:14:12.960 --> 00:14:17.160 this is just the probability of seeing the data without any condition – 00:14:17.160 --> 00:14:21.050 or conditioning on the specific model. And so this is the term that helps you 00:14:21.050 --> 00:14:23.769 to do the model selection.
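Written out, the starting-point equation is Bayes' theorem for the source parameters theta given the data D observed so far; the numerator is what drives the sampling, and the denominator is the evidence used for deciding between one-event and multi-event models:

```latex
% Posterior over source parameters theta given the seismic data D seen so far.
\begin{align}
  p(\theta \mid D) &= \frac{p(D \mid \theta)\, p(\theta)}{p(D)}
      \;\propto\; p(D \mid \theta)\, p(\theta), \\
  p(D) &= \int p(D \mid \theta)\, p(\theta)\, \mathrm{d}\theta .
\end{align}
```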
However, calculation of this integral 00:14:23.769 --> 00:14:28.459 would be super hard in most case. And so what we are doing here in 00:14:28.459 --> 00:14:34.860 this implementation for using Bayesian in early warning is that we focus on 00:14:34.860 --> 00:14:41.460 the upper part. Because this posterior is proportional to these two terms. 00:14:41.460 --> 00:14:46.420 We use a sampling scheme to try to estimate this posterior distribution. 00:14:46.420 --> 00:14:51.490 So, instead of having a actual mathematical form, we will use a bunch 00:14:51.490 --> 00:14:56.660 of sample that’s drawn from this distribution to estimate the distribution. 00:14:57.480 --> 00:15:03.540 And, for the bottom part, we would just take some approximation scheme. 00:15:03.540 --> 00:15:08.000 And this is the part to help us to decide whether this case – this particular 00:15:08.000 --> 00:15:12.860 case is having only one earthquake or having multiple earthquake. 00:15:13.500 --> 00:15:19.000 Now, let me skip this part for now and only focus on the top part. 00:15:19.000 --> 00:15:27.000 The top part – well, first of all, I will focus on the likelihood term, that I call. 00:15:27.000 --> 00:15:31.459 Because the prior term is just what you expect may have a earthquake without 00:15:31.459 --> 00:15:35.529 any data. So there’s a lot of things you can just do randomly. 00:15:35.529 --> 00:15:39.780 Let’s say you focus on a particular area, and you ignore a certain area – 00:15:39.780 --> 00:15:42.290 you say there’s no earthquake and things like that. 00:15:42.290 --> 00:15:46.860 But the likelihood is something that you have to feed the data in to update 00:15:46.860 --> 00:15:50.480 your prediction on where the earthquake is actually happening. 00:15:50.480 --> 00:15:55.399 And the way to do that, for our case, we focus on two parts. 00:15:55.399 --> 00:16:02.829 One is the waveform amplitudes residual – well, let me – let me 00:16:02.829 --> 00:16:06.500 not say residual for now. So we just try to write the likelihood 00:16:06.500 --> 00:16:11.370 in terms of the waveform amplitudes and the picking time. 00:16:11.370 --> 00:16:16.190 So when we have the original data is the full seismic data, of course, 00:16:16.190 --> 00:16:22.100 from all the stations, but we can only – we only focus on certain information, 00:16:22.100 --> 00:16:26.120 or certain feature, in the waveform data. 00:16:26.740 --> 00:16:28.480 And here, we pick out two of them. 00:16:28.480 --> 00:16:34.129 Well, if you take the log of the distribution, this is – we assume these 00:16:34.129 --> 00:16:39.470 two terms are individually all Gaussian, then take the log of it, you see some 00:16:39.470 --> 00:16:47.089 form of a summation of the residuals. So this is a typical modeling that we do. 00:16:47.089 --> 00:16:52.699 Because Gaussian is a very easy distribution to model – to be used. 00:16:52.699 --> 00:16:59.569 So, at the end of the day, with all this, like, mathematical theory that’s 00:16:59.569 --> 00:17:02.529 written there, and we are just looking at residual. 00:17:02.529 --> 00:17:07.900 So you may say, well, [laughs] why even bother talking about Bayesian? 00:17:07.900 --> 00:17:13.460 Well, these residual, at the end of the day, will be the driving force for me 00:17:13.470 --> 00:17:18.740 to do the sampling, and I will use these terms to help me to estimate the 00:17:18.740 --> 00:17:25.680 uncertainty of my prediction at the end. So hopefully you will see it later. 
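As a rough sketch of that Gaussian residual form, the log-likelihood of a candidate source is a sum of squared amplitude and pick-time residuals over the selected stations. The toy attenuation relation, the toy travel-time model, the station fields, and the sigma values below are placeholders of mine, not the operational JMA models:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Station:
    x: float              # station coordinates in km (flat-earth toy geometry)
    y: float
    obs_log_amp: float    # observed log peak amplitude
    obs_pick: float       # observed P-wave pick time, in seconds

def predict_log_amp(theta, sta):
    """Toy attenuation relation standing in for the real ground-motion model."""
    mag, ex, ey, t0 = theta
    dist = np.hypot(sta.x - ex, sta.y - ey) + 1.0
    return 0.5 * mag - 1.5 * np.log10(dist)

def predict_pick(theta, sta, vp=6.0):
    """Toy homogeneous-halfspace P travel time: origin time t0 + distance / vp."""
    mag, ex, ey, t0 = theta
    return t0 + np.hypot(sta.x - ex, sta.y - ey) / vp

def log_likelihood(theta, stations, sigma_amp=0.3, sigma_pick=1.0):
    """Gaussian log-likelihood: a sum of squared amplitude and pick-time residuals
    over the selected stations, with pre-determined sigmas, as described above."""
    logL = 0.0
    for sta in stations:
        r_amp = sta.obs_log_amp - predict_log_amp(theta, sta)
        r_pick = sta.obs_pick - predict_pick(theta, sta)
        logL -= 0.5 * (r_amp / sigma_amp) ** 2 + 0.5 * (r_pick / sigma_pick) ** 2
    return logL
```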
00:17:26.460 --> 00:17:29.860 But, at this point, I would just say that we focus on these two, and the question 00:17:29.860 --> 00:17:33.810 is, how do we actually get this number? The amplitude part, I think is 00:17:33.810 --> 00:17:38.850 pretty simple. You just have your source model do the prediction. 00:17:38.850 --> 00:17:42.610 And then you take the actual observed one, and then you can calculate them. 00:17:42.610 --> 00:17:47.860 This sigma term is something that we pre-determined. 00:17:47.860 --> 00:17:52.000 And so it’s kind of a pretty cool thing that we do here. 00:17:52.000 --> 00:17:55.170 Let me directly jump to the actual algorithm for doing 00:17:55.170 --> 00:17:59.929 all these things that I just mentioned. 00:17:59.929 --> 00:18:04.169 As I said, I just want to do sampling out of this equation, right? 00:18:04.169 --> 00:18:08.809 So what I’m doing here is that I first draw sample from the prior. 00:18:08.809 --> 00:18:13.390 What it means it simply is to say that, let’s look at a particular area that I think 00:18:13.390 --> 00:18:17.710 there may be a earthquake with some criteria that I will tell you later. 00:18:17.710 --> 00:18:24.360 Then I will just put some samples – means some – I will focus – 00:18:24.360 --> 00:18:27.920 like putting down a grid, for example, in a certain area. 00:18:27.920 --> 00:18:34.180 I would just then calculate the amplitude and estimate the picking 00:18:34.180 --> 00:18:42.080 time based on – oops. The likelihood function at all point of these grids. 00:18:42.080 --> 00:18:46.140 So, with these grids, I would do all this likelihood calculation 00:18:46.140 --> 00:18:50.590 that you saw in the previous slide. And I will find this likelihood value 00:18:50.590 --> 00:18:55.220 for every single grid point. And I call them the weights. 00:18:55.220 --> 00:18:58.860 So I’m basically looking at, at this grid point, which point 00:18:58.860 --> 00:19:02.390 is more likely to be the source of the earthquake. 00:19:02.390 --> 00:19:06.840 When I first have the first time-step data. 00:19:06.840 --> 00:19:10.539 Now, afterwards, because of some statistical reasons, I have to 00:19:10.539 --> 00:19:15.900 do some so-called resampling that has basically allowed the dots 00:19:15.900 --> 00:19:20.880 to move around a little bit and then, based on the weights. 00:19:21.700 --> 00:19:26.180 And I will recalculate the weights again when there is new data coming in. 00:19:26.190 --> 00:19:30.890 Because you will see more and more waveform data coming in as time goes 00:19:30.890 --> 00:19:37.000 by. And then I do resampling again, and I just keep looping this process. 00:19:37.000 --> 00:19:41.440 So this is a very simple algorithm. You start with a grid. 00:19:41.440 --> 00:19:46.010 You calculate the weights representing the likelihood of this particular grid 00:19:46.010 --> 00:19:49.860 point to be the source of the earthquake, 00:19:49.860 --> 00:19:55.680 and then you just move around the dots and just keep looping, right? 00:19:58.360 --> 00:20:04.620 Sorry. So this algorithm allows me to get a map, basically, to see where it’s 00:20:04.630 --> 00:20:09.289 most likely to be in, and I will show you exactly how this map looks like later. 00:20:09.289 --> 00:20:14.830 But imagine – I mentioned about the fact that we have to decide if there’s 00:20:14.830 --> 00:20:19.920 multiple earthquake or not, right? And this is not in this loop. 
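A minimal sketch of that loop, assuming the log-likelihood function from the previous sketch; the prior samples are drawn here as a random cloud around the first trigger rather than a regular grid, and all the tuning numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def init_particles(center_x, center_y, half_width=100.0, n=1000):
    """Prior samples: a cloud of candidate sources laid out around the first
    triggered station (uniform box here; the real system lays down a grid)."""
    mags = rng.uniform(3.0, 8.0, n)
    xs = rng.uniform(center_x - half_width, center_x + half_width, n)
    ys = rng.uniform(center_y - half_width, center_y + half_width, n)
    t0s = rng.uniform(-10.0, 0.0, n)
    return np.column_stack([mags, xs, ys, t0s])   # theta = (mag, x, y, t0)

def update(particles, log_likelihood_fn, stations):
    """One time step: weight every candidate by its likelihood given the data
    seen so far, then resample so the cloud concentrates where weights are high."""
    logw = np.array([log_likelihood_fn(p, stations) for p in particles])
    w = np.exp(logw - logw.max())                 # stabilize in log space
    w /= w.sum()
    idx = rng.choice(len(particles), size=len(particles), p=w)   # resampling
    jitter = rng.normal(scale=[0.1, 2.0, 2.0, 0.2], size=particles.shape)
    return particles[idx] + jitter, w             # small jitter keeps samples diverse

# Main loop, schematically: every second, fold in the newly arrived data.
#   particles = init_particles(first_trigger.x, first_trigger.y)
#   while event_is_active:
#       particles, weights = update(particles, log_likelihood, selected_stations)
#       source_estimate = particles.mean(axis=0)   # published mean prediction
```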
00:20:22.380 --> 00:20:25.930 The way we actually decide if there’s multiple earthquake or not is, 00:20:25.930 --> 00:20:32.480 after seeing this – working on this loop, at every time step, we have a bunch of 00:20:32.480 --> 00:20:37.059 dots with its weight, right? And this information is – 00:20:37.060 --> 00:20:46.460 we can use it to sort of look at the variance of these rates. Oops. 00:20:46.460 --> 00:20:50.679 And try to use that to make some decision if my prediction 00:20:50.679 --> 00:20:55.400 is quite bad or not. And if my prediction is quite bad, 00:20:55.400 --> 00:21:00.220 I basically say, okay, I think one earthquake is not very likely because 00:21:00.220 --> 00:21:03.789 it’s not predicting everything very well, and I would just add on maybe 00:21:03.789 --> 00:21:08.030 a multiple earthquake. And the process – instead of 00:21:08.030 --> 00:21:11.909 talking about the math behind it, let me just look at – help go through 00:21:11.909 --> 00:21:18.900 one case study to see how we do it exactly. 00:21:18.900 --> 00:21:24.420 First of all, we have the criteria in the JMA system right now that is – 00:21:24.429 --> 00:21:30.990 we require three trigger station to define there’s actually an earthquake. 00:21:30.990 --> 00:21:36.429 So we start with – whenever we have a single trigger, we would call – 00:21:36.429 --> 00:21:41.360 we would create a pool – we call pending events. 00:21:41.360 --> 00:21:45.030 So whenever there’s a trigger, we call it a pending event. 00:21:45.030 --> 00:21:49.470 And later on, when we have more and more triggers, we will decide whether 00:21:49.470 --> 00:21:54.440 this event belongs to the same event – this trigger belongs to the same event. 00:21:54.440 --> 00:21:56.760 And we group them inside the pending event. 00:21:56.760 --> 00:22:00.000 So these pending event is kind of simple rule-based. 00:22:00.000 --> 00:22:03.720 And, at the end, when we have enough of triggers that we decide to be the 00:22:03.720 --> 00:22:10.169 single event, then, when we have – the point that we reach three triggers, 00:22:10.169 --> 00:22:14.320 we say this is probably existing events, 00:22:14.320 --> 00:22:20.300 and we will start actually doing all these sampling calculations. 00:22:20.760 --> 00:22:25.480 And, when we have a new trigger, we have to decide, this is actually 00:22:25.490 --> 00:22:28.919 the same event or not. And the decision is based on 00:22:28.919 --> 00:22:32.559 what I showed you about these ways that I’ve calculated 00:22:32.560 --> 00:22:36.700 for the events, and we look at the variance and stuff. 00:22:36.700 --> 00:22:41.300 And we’ve – that’s sort of what I call the approximate Bayesian calculation. 00:22:41.309 --> 00:22:46.720 With that decision, I can decide whether it’s the same event or not. 00:22:46.720 --> 00:22:50.540 And, if there is one trigger that is not the same event, 00:22:50.540 --> 00:22:54.220 then it would be, again, created independent events. 00:22:54.220 --> 00:22:57.150 And from there, I would keep track on the multiple events. 00:22:57.150 --> 00:23:03.390 So, theoretically speaking, what I’m doing is just – I sort of moving from – 00:23:03.390 --> 00:23:06.100 keep tracking of potential events. 00:23:06.100 --> 00:23:11.330 And then try to use some rules to decide they are the same or not in real time. 
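One way to picture the pending-event bookkeeping just described, reusing the earlier sketches; the per-trigger scoring and the threshold below are crude stand-ins of mine for the approximate Bayesian criterion based on the weights and their variance:

```python
def explains_trigger(event_particles, trigger, threshold=-10.0):
    """Does an existing event explain the new trigger acceptably well?
    Here: the best per-particle log-likelihood for that one station, compared
    against a pre-tuned threshold (illustrative value)."""
    best = max(log_likelihood(p, [trigger]) for p in event_particles)
    return best > threshold

def on_new_trigger(trigger, events, pending, min_triggers=3):
    """Rule-based association sketch: attach the trigger to an event that can
    explain it; otherwise hold it as pending, and promote the pending group to
    a new event once it has collected `min_triggers` picks."""
    for ev in events:
        if explains_trigger(ev["particles"], trigger):
            ev["triggers"].append(trigger)
            return
    pending.append(trigger)
    if len(pending) >= min_triggers:
        events.append({
            "triggers": list(pending),
            "particles": init_particles(pending[0].x, pending[0].y),
        })
        pending.clear()
```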
00:23:11.330 --> 00:23:16.090 And mathematically speaking, this is basically a kind of approximate model 00:23:16.090 --> 00:23:20.950 selection, but I only do a sort of a gridded search starting from one event, 00:23:20.950 --> 00:23:24.200 and then I decided if there’s extra one or not. 00:23:24.200 --> 00:23:27.940 So, instead of gridded search, of course, I can, at any time, 00:23:27.940 --> 00:23:32.300 at any point, I check one event, two event, three event, 00:23:32.309 --> 00:23:36.110 four event – I check all the – all of them, the probability of them. 00:23:36.110 --> 00:23:40.020 And then I take the biggest one with the biggest probability. 00:23:40.020 --> 00:23:43.899 But we have checked that, given the calculation time 00:23:43.899 --> 00:23:47.220 we have in each seconds, this is really hard to implement. 00:23:47.220 --> 00:23:51.760 And that’s why we decided to use these kind of approximate methods. 00:23:53.880 --> 00:23:58.820 There’s a few things – little things that we have implemented in the system as well. 00:23:58.820 --> 00:24:00.760 First thing, in – particularly in Japan, 00:24:00.760 --> 00:24:04.800 there’s two set of seismometers that we can use. 00:24:04.800 --> 00:24:08.630 One is the JMA system. We have around 300 stations. 00:24:08.630 --> 00:24:13.850 Another one is the Hi-net stations. We have 700 of them. 00:24:13.850 --> 00:24:19.799 And, before our implementation, they are separately looked 00:24:19.799 --> 00:24:22.660 and used [inaudible]. 00:24:22.660 --> 00:24:27.460 And what we say is that, well, there’s just no reason why not combining them. 00:24:27.460 --> 00:24:31.710 So we tried to do something to combine the results and use them all in our 00:24:31.710 --> 00:24:35.830 system. So keep in mind that all the simulation I’ve run is based on 00:24:35.830 --> 00:24:40.649 the combined of the two system. And the second thing that we’ve done 00:24:40.649 --> 00:24:46.080 is to try to take advantage of as much data as possible. 00:24:46.080 --> 00:24:48.940 Because we have so many stations. 00:24:49.780 --> 00:24:54.680 What used to be done in JMA is that, whenever there is trigger, those station 00:24:54.690 --> 00:24:59.850 would be activated, and the information will be used in the calculation for 00:24:59.850 --> 00:25:05.280 your source – earthquake source. So basically, there’s separated 00:25:05.280 --> 00:25:10.850 six stations there. And they are fed back into the algorithm. 00:25:10.850 --> 00:25:14.840 But now, in our implementation, in some way, we are also using 00:25:14.840 --> 00:25:20.669 the information that all those other stations are not triggered. 00:25:20.669 --> 00:25:24.930 And specifically how we’ve done it in our implementation is that, if you look 00:25:24.930 --> 00:25:30.159 back how we calculate our likelihood using both amplitudes information 00:25:30.159 --> 00:25:36.049 and pick time information, we implement this information 00:25:36.049 --> 00:25:41.620 in the picking time likelihood. The way we look at it is that we – 00:25:41.620 --> 00:25:47.130 you have the theoretical – well, assume we see an earthquake event. 00:25:47.130 --> 00:25:51.299 Then you have the theoretical arrival time – let’s say the 00:25:51.299 --> 00:25:54.559 P wave arrival time – at a certain station. 00:25:54.559 --> 00:25:59.429 Now, so you can look at, theoretically, they have arrived or not. 
00:25:59.429 --> 00:26:03.409 And, if there – on the other hand, for a particular station, you can look at 00:26:03.409 --> 00:26:09.090 if they are triggered or not, right? So you have this typical table 00:26:09.090 --> 00:26:16.090 of four boxes. When the station is triggered, and theoretically, there is – 00:26:16.090 --> 00:26:21.280 picking time exists, you can just use the values and calculate the residual. 00:26:21.280 --> 00:26:27.130 This is typically how the deterministic method is also used. 00:26:27.130 --> 00:26:31.690 What is not often used is the remaining three box in the – 00:26:31.690 --> 00:26:36.560 in the traditional methods. First of all, if the station is not triggered, 00:26:36.560 --> 00:26:41.000 but theoretically, there is no – 00:26:41.000 --> 00:26:45.740 sorry, I’m – when the station is triggered, but theoretically, there is – 00:26:45.740 --> 00:26:49.720 you don’t have a picking time, then what we are doing now is we 00:26:49.720 --> 00:26:55.840 just assume the theoretical picking time may be coming in the future. 00:26:56.770 --> 00:27:02.160 We just use – take the theoretical time to be current time. 00:27:02.179 --> 00:27:05.080 Or, depending on your system, you can take it as a current time 00:27:05.080 --> 00:27:12.360 plus the time interval, basically looking at the next time step. 00:27:12.360 --> 00:27:15.580 The other way around – if, theoretically, you think there is a picking time, 00:27:15.580 --> 00:27:19.279 but the station has not been triggered yet, you may assume, in the next time 00:27:19.279 --> 00:27:23.409 step, there’s a potential it will be triggered, or, in some way, you just 00:27:23.409 --> 00:27:28.140 set the picking time to the next time step or the current time step. 00:27:28.140 --> 00:27:33.100 And the whole point of doing that is to penalize, as the time goes by, 00:27:33.100 --> 00:27:36.740 the fact that there’s a mismatch between whether there’s 00:27:36.740 --> 00:27:42.300 theoretically a picking time or not, versus whether there’s an actual trigger or not. 00:27:42.309 --> 00:27:46.900 And, of course, when they agree that there’s no earthquake – there’s 00:27:46.900 --> 00:27:52.700 no picking time, then the likelihood function would just be 1, which means 00:27:52.700 --> 00:27:57.510 the log-likelihood would be zero for that station. Basically, that means there’s no 00:27:57.510 --> 00:28:02.460 contribution of information from there. So this way, we will be able to 00:28:02.460 --> 00:28:07.040 also utilize the fact that some station are not triggered. 00:28:07.909 --> 00:28:13.180 Another thing that we’ve implemented is the way we choose 00:28:13.190 --> 00:28:17.020 what are the stations to be used for a particular event. 00:28:17.020 --> 00:28:21.870 In Japan, there’s a very specific scenario where we have a lot of stations. 00:28:21.870 --> 00:28:28.940 So, if you just use every single information from every stations, 00:28:28.940 --> 00:28:35.630 after certain time step, you realize that your likelihood calculation 00:28:35.630 --> 00:28:39.279 become very heavy, first of all. The second thing is it’s very hard to 00:28:39.279 --> 00:28:45.010 get accurate uncertainty quantification. Because we are assuming every single 00:28:45.010 --> 00:28:49.769 station has independent information, but in reality, there’s always 00:28:49.769 --> 00:28:54.260 some correlation there.
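In pseudocode form, the four-box pick-time bookkeeping just described amounts to roughly this (my paraphrase of the scheme, not JMA code; whether to substitute the current time or the next time step depends on the system, as noted):

```python
def pick_time_residual(theoretical_arrival, observed_pick, now, dt=1.0):
    """Return a residual for the Gaussian pick-time term, or None when the
    station carries no information yet for this candidate source."""
    triggered = observed_pick is not None
    should_have_arrived = theoretical_arrival <= now
    if triggered and should_have_arrived:
        return observed_pick - theoretical_arrival       # ordinary residual
    if triggered and not should_have_arrived:
        # Station fired before this candidate source says it should have:
        # treat the theoretical pick as "now", so the mismatch gets penalized.
        return observed_pick - now
    if not triggered and should_have_arrived:
        # Station should have triggered but has not: assume it might trigger at
        # the next step, so the penalty grows the longer it stays silent.
        return (now + dt) - theoretical_arrival
    return None   # not triggered and not yet expected: zero log-likelihood contribution
```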
And so, after certain number of stations 00:28:54.260 --> 00:28:59.010 information being used, you actually have some trouble 00:28:59.010 --> 00:29:02.760 doing a nice convergence of your solution. 00:29:02.760 --> 00:29:08.799 Here, instead, we fix ourselves to use the – only 20 station at max 00:29:08.799 --> 00:29:13.179 for single events. But the question now is, how do we choose so many station 00:29:13.179 --> 00:29:19.309 out of this existing catalog of stations? We realize, if you just take the closest 00:29:19.309 --> 00:29:23.529 distance, this would not be good because, especially in the coastline, 00:29:23.529 --> 00:29:28.029 it would be a very bad coverage of the – for the coastline. 00:29:28.029 --> 00:29:34.000 So what we’ve done is we just look at the Voronoi cell for each station when 00:29:34.000 --> 00:29:41.300 they are first triggered. And we take that cell’s mean as a reference point. 00:29:41.300 --> 00:29:46.470 First, we always want a station around the first trigger station. 00:29:46.470 --> 00:29:51.279 So we pick 10 stations that is closest to them. 00:29:51.279 --> 00:29:57.590 And then, the next 10 stations we pick, we want to maximize the azimuth 00:29:57.590 --> 00:30:00.590 coverage instead of just looking at the distance. 00:30:00.590 --> 00:30:06.630 And using these two rules, we will be able to automatically pick stations that 00:30:06.630 --> 00:30:12.019 looks like the left-hand side here, when you’re on the coastal – 00:30:12.019 --> 00:30:14.909 this one when you’re around the coastline. 00:30:14.909 --> 00:30:20.470 And, if you are in the middle, it would automatically pick stations like that. 00:30:20.470 --> 00:30:26.909 So this is the rule that we use to – before – this is – this is the table, 00:30:26.909 --> 00:30:30.679 basically, that we use for every single system that becomes the 00:30:30.679 --> 00:30:35.049 first trigger station, we have a table to show, what are the 00:30:35.049 --> 00:30:39.760 potential station to be used for the likelihood calculation. 00:30:41.000 --> 00:30:46.470 One more little detail in the algorithm is that, because of the calculation time 00:30:46.470 --> 00:30:51.320 required, we cannot lay out a huge grid everywhere in Japan. 00:30:51.320 --> 00:30:57.460 So we focus on the locations around the first trigger station. 00:30:57.460 --> 00:31:01.669 So we do this little trick. We say that, okay, let’s start with 00:31:01.669 --> 00:31:09.279 a grid around the first trigger station. But, at some point, if the predicted 00:31:09.280 --> 00:31:16.060 earthquake location is quite far, or quite on the edge, on that grid, 00:31:16.060 --> 00:31:23.160 we will automatically re-shift the area of interest to be centered at this point. 00:31:23.170 --> 00:31:27.710 So this [inaudible] mechanism allowed us to focus more computation time 00:31:27.710 --> 00:31:32.289 in the area that is more likely to be relevant and only move away 00:31:32.289 --> 00:31:37.549 our computation power to the – to the different area when we 00:31:37.549 --> 00:31:39.299 think it’s necessary. 00:31:39.299 --> 00:31:44.510 So I think this also helps a lot to reduce the calculation time in our algorithm. 00:31:44.510 --> 00:31:50.630 So, in summary, this is what we are doing now. 00:31:50.630 --> 00:31:55.919 Whenever you start the whole system, you start getting information from 00:31:55.919 --> 00:32:00.279 all the stations. Whenever you see a trigger, we would do all this 00:32:00.279 --> 00:32:05.940 pending events and actual event thing. 
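A sketch of the two-rule station selection just outlined: the nearest ten stations around the first trigger, then ten more chosen to spread out in azimuth rather than in distance. The greedy azimuth spreading and the station fields are my assumptions; in the real system this is a table precomputed per first-trigger station, using the Voronoi-cell reference point:

```python
import numpy as np

def select_stations(first_sta, all_stations, n_near=10, n_azimuth=10):
    """Pick n_near closest stations to the first trigger, then n_azimuth more
    that maximize azimuthal coverage (greedy farthest-azimuth heuristic)."""
    dist = lambda s: np.hypot(s.x - first_sta.x, s.y - first_sta.y)
    az = lambda s: np.arctan2(s.y - first_sta.y, s.x - first_sta.x)
    others = sorted((s for s in all_stations if s is not first_sta), key=dist)
    chosen = others[:n_near]
    remaining = others[n_near:]
    chosen_az = [az(s) for s in chosen]
    for _ in range(min(n_azimuth, len(remaining))):
        # pick the station whose azimuth is farthest from everything chosen so far
        def azimuth_gap(s):
            diffs = np.angle(np.exp(1j * (az(s) - np.array(chosen_az))))
            return np.abs(diffs).min()
        best = max(remaining, key=azimuth_gap)
        chosen.append(best)
        chosen_az.append(az(best))
        remaining.remove(best)
    return chosen
```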
And, once there’s actually event there, 00:32:05.940 --> 00:32:10.830 we would compute the – we have some samples laid out. 00:32:10.830 --> 00:32:13.830 And it continuously updates the weight of the samples that 00:32:13.830 --> 00:32:19.340 represents how likely that sample represents a source. 00:32:19.340 --> 00:32:24.400 And every time steps, we look at the potential of, if there’s new triggers, 00:32:24.400 --> 00:32:27.669 should we merge the events, should we create a new event 00:32:27.669 --> 00:32:32.899 based on the rule that I’ve mentioned, and just keep looping. 00:32:32.899 --> 00:32:39.730 So with this kind of relatively simple algorithm, these are a few cases that 00:32:39.730 --> 00:32:45.730 I would show that this algorithm can do quite a good job. 00:32:45.730 --> 00:32:49.620 This is the foreshock of Tohoku earthquake. 00:32:51.150 --> 00:33:01.000 On the left top part is the error of the distance from our prediction 00:33:01.010 --> 00:33:04.409 to the actual epicenter. And so, as it drops to zero, 00:33:04.409 --> 00:33:08.059 it would be the best. The origin time – this difference. 00:33:08.059 --> 00:33:13.889 Also drops to zero. And the depth, unfortunately 00:33:13.889 --> 00:33:19.070 there’s a – there’s a consistent bias that we cannot predict well. 00:33:19.070 --> 00:33:23.500 And the magnitude grows, of course, with time, towards the actual 00:33:23.500 --> 00:33:26.789 magnitude predicted – published at the end. 00:33:26.789 --> 00:33:34.820 And if I show you the video of how all this sampling works, this is how it looks. 00:33:34.820 --> 00:33:39.870 The color coding is the weights, or the likelihood value of that – 00:33:39.870 --> 00:33:42.290 of that particular samples. 00:33:42.290 --> 00:33:46.720 The dark blue star is the actual epicenter. 00:33:46.720 --> 00:33:51.280 And the light blue star is the predicted mean epicenter. 00:33:51.940 --> 00:33:56.820 And, as you can see, the red triangle is the first trigger station, and the 00:33:56.820 --> 00:34:03.130 black triangles are the station chosen to be used as information inputs to 00:34:03.130 --> 00:34:08.070 calculate all these likelihood values. And, in this particular case, 00:34:08.070 --> 00:34:12.160 it’s quite fast to be converged. We start on the coast side. 00:34:12.160 --> 00:34:17.160 And, as you see, as I mentioned, there’s a shift of the grid point 00:34:17.160 --> 00:34:23.180 because there’s updating schedule when we predict the epicenter 00:34:23.180 --> 00:34:27.750 to be quite far away from our original laid-out grid point. 00:34:27.750 --> 00:34:32.750 Later on, we just converged. 00:34:32.750 --> 00:34:36.300 So this is just a demonstration of how things work. 00:34:37.000 --> 00:34:42.660 This demonstration is showing you the case of having two events at the 00:34:42.660 --> 00:34:48.820 same location but shifted in time. There are two small events. 00:34:48.820 --> 00:34:53.860 And, for JMA, this is the one that they basically misunderstood as 00:34:53.860 --> 00:34:57.950 one single event and published a false alarm. 00:34:57.950 --> 00:35:01.300 But what we are able to do is that, when a – when a second event comes 00:35:01.300 --> 00:35:05.130 out, we are able to recognize this as a second event because of the 00:35:05.130 --> 00:35:09.250 fact that the first event didn’t explain it well based on the 00:35:09.250 --> 00:35:14.250 calculation of our likelihood. 
And we are able to start another new 00:35:14.250 --> 00:35:20.170 event in our database and continually tracking two events at the same time. 00:35:20.170 --> 00:35:25.280 And, of course, both events, the prediction was able to 00:35:25.280 --> 00:35:30.240 converge to the – to the actual source values. 00:35:30.800 --> 00:35:36.040 You may realize the depth part is quite hard to predict most of the time. 00:35:36.040 --> 00:35:38.810 And we always have a – have a error there. 00:35:38.810 --> 00:35:43.130 But I think – I mean, the error wasn’t that big, and the uncertainty that we 00:35:43.130 --> 00:35:51.580 predicted for the depth usually is within 1 standard deviation from the true value. 00:35:51.580 --> 00:35:55.840 So that shows – kind of shows that, by using these samples, we are 00:35:55.840 --> 00:35:59.570 giving out not just a mean prediction, but also the uncertainty. 00:35:59.570 --> 00:36:06.950 And the uncertainty does have some physically meaningful value in there. 00:36:06.950 --> 00:36:13.760 So by doing this, we re-run all this historical data recorded in JMA. 00:36:13.760 --> 00:36:22.160 And we are able to reduce significantly the amount of false alarm. 00:36:22.160 --> 00:36:26.170 Well, if you add up the number, it’s not quite what I’ve shown 00:36:26.170 --> 00:36:29.400 before because I think some of the events we were not able to 00:36:29.400 --> 00:36:32.140 get the full information and re-run them. 00:36:33.780 --> 00:36:40.460 Now, I think, from now on, I just want to show more and more case study. 00:36:40.460 --> 00:36:44.660 Because I think that’s what’s interesting about this algorithm. 00:36:44.660 --> 00:36:50.160 We thought it would only work in Japan with a lot of stations because we rely on 00:36:50.160 --> 00:36:54.800 the fact that we are calculating all this uncertainty to be useful. 00:36:54.800 --> 00:37:00.510 But surprisingly, some of the cases outside Japan that does not have 00:37:00.510 --> 00:37:04.900 so much stations is kind of working not too bad. 00:37:04.900 --> 00:37:08.320 So I’m trying to show some of them here. 00:37:08.320 --> 00:37:14.700 So these are a lot done by my colleagues in Japan in Tokyo – 00:37:14.700 --> 00:37:20.040 sorry, in Kyoto University, DPRI, who’s Professor Masumi Yamada. 00:37:21.320 --> 00:37:25.320 First of all, this is the example from Nepal. 00:37:25.320 --> 00:37:30.550 What she did was to look at two sets of simulated data. 00:37:30.550 --> 00:37:37.610 One’s using only small sets of stations. And the other one – I think they have – 00:37:37.610 --> 00:37:41.940 the French also put some stations there, but they always run independently. 00:37:41.940 --> 00:37:45.120 So, similar to Japan, we have, like, Hi-net, JMA, 00:37:45.120 --> 00:37:47.040 always getting independent. 00:37:47.040 --> 00:37:52.390 What she did is look at, if you used both, versus if you only used one. 00:37:52.390 --> 00:37:54.880 There is some details that I’m not very familiar with 00:37:54.880 --> 00:37:58.390 how she created the data sets. 00:37:58.390 --> 00:38:03.970 If there is interest there, I may try to answer at the end of the presentation. 00:38:03.970 --> 00:38:06.810 But I will jump directly to the prediction. 00:38:06.810 --> 00:38:13.320 This is a case where you have only the small set of data originally there. 00:38:13.320 --> 00:38:21.140 We can converge, but it takes time, and it’s not – it’s a little bit jumping 00:38:21.140 --> 00:38:25.900 around in the middle too. And the convergence wasn’t that good. 
00:38:25.900 --> 00:38:31.760 But, if I consider having two networks, then obviously the convergence 00:38:31.760 --> 00:38:36.050 becomes better. Well, I shouldn’t say “obvious,” 00:38:36.050 --> 00:38:38.820 because the color code doesn’t show that clearly. 00:38:38.820 --> 00:38:43.430 But I would like to put a note there. The color coding is actually 00:38:43.430 --> 00:38:46.800 the log likelihood. Which means that, if you take the 00:38:46.800 --> 00:38:53.430 exponential, it’s the actual probability. So it’s actually very peak around 00:38:53.430 --> 00:38:58.720 the light blue star, which is our mean prediction, in most case. 00:38:58.720 --> 00:39:03.470 So this very red time is where we have quite a bit uncertainty. 00:39:03.470 --> 00:39:08.800 But if the take the exponential, you see it’s not a uniform thing, actually. 00:39:09.680 --> 00:39:15.010 So the value of prediction is shown here. If you use two network station, 00:39:15.010 --> 00:39:18.680 of course, we know for sure it would become better. 00:39:18.680 --> 00:39:23.820 How much better, I think, in this particular case, significantly better. 00:39:24.560 --> 00:39:28.980 So we have a few more case study here. 00:39:30.700 --> 00:39:36.900 This – they are both in synthetic data and some observed data. 00:39:36.900 --> 00:39:40.790 Because of time, let me try to jump a little faster. 00:39:40.790 --> 00:39:45.250 Most of the time, we are always doing these kind of tests of, you know, how is 00:39:45.250 --> 00:39:48.720 the distribution of the stations going to affect our predictions. 00:39:48.720 --> 00:39:52.500 And that’s why you see most of the time it’s sort of 00:39:52.500 --> 00:39:57.060 looking at subsets of stations versus the full set of stations. 00:39:57.060 --> 00:40:01.660 Again, this is the synthetic data, and the prediction seems to 00:40:01.660 --> 00:40:04.920 converge not too bad. In this particular case, 00:40:04.920 --> 00:40:09.010 the depth is actually converged very well, surprisingly. 00:40:09.010 --> 00:40:13.060 This is the [chuckles] rare case study it converged very well. 00:40:13.060 --> 00:40:18.140 Well, the more interesting one is the observed data case. 00:40:20.800 --> 00:40:25.420 This is just to show the attenuation equation versus observation. 00:40:25.420 --> 00:40:29.220 It’s aligning quite well, and that’s why we expect this particular case 00:40:29.220 --> 00:40:33.060 to work pretty well with our methods. 00:40:33.060 --> 00:40:38.890 As you can see, it does converge, but it – and we are still using 00:40:38.890 --> 00:40:42.240 three picking times, so the time delay is quite big. 00:40:42.240 --> 00:40:48.380 It’s not within just a few seconds, but I think around a little bit more. 00:40:48.380 --> 00:40:50.500 Closer to 10 seconds. 00:40:52.780 --> 00:40:57.510 So here’s, once again, the video – because these time – you can see 00:40:57.510 --> 00:41:01.160 the starting grid is quite big. And it’s quite sparse because 00:41:01.160 --> 00:41:07.140 the number of stations – the distribution of station is way bigger. 00:41:07.140 --> 00:41:12.780 But we are still be able to converge somehow in this particular earthquake. 00:41:12.780 --> 00:41:15.460 And so the people there are quite happy. 00:41:16.500 --> 00:41:19.820 Well, one thing to note. You see the starting point is very sparse, 00:41:19.830 --> 00:41:23.950 and then there’s a lot of particles again. 
And that’s because the fact that we are 00:41:23.950 --> 00:41:29.110 actually showing upper side – the 2D, we also have particles in terms of depth, 00:41:29.110 --> 00:41:34.660 so it’s a 3D grid in reality. So once I do resampling, it’s actually spread out. 00:41:37.360 --> 00:41:42.780 And another one, somehow not working as well for this one. 00:41:46.080 --> 00:41:50.450 And this is not working as well because it’s on the coast side. 00:41:50.450 --> 00:41:54.180 And we have only one side of the station. 00:41:58.820 --> 00:42:06.140 This one, I think, is the one that we saw a obvious difference between 00:42:06.140 --> 00:42:10.110 the attenuation equation that we used to have. 00:42:10.110 --> 00:42:13.980 And so we are expecting some problem there too. 00:42:16.260 --> 00:42:22.820 But pretty much most of the cases, you can see, is that within the – if it’s inland, 00:42:22.820 --> 00:42:29.080 and if you have, like, four or five stations covered pretty well, 00:42:29.080 --> 00:42:33.660 most of the time, it works pretty well. So because of the fact that we used 00:42:33.660 --> 00:42:40.170 the uncertainty, I think it allows you to take advantage of some of – 00:42:40.170 --> 00:42:44.680 to sort of be a little bit more robust towards the data. 00:42:44.680 --> 00:42:48.430 When you have, like, say, five, six stations, and one of them 00:42:48.430 --> 00:42:51.620 kind of have some bigger error. 00:42:51.620 --> 00:42:58.380 The last cases I want to touch on that I put – is this Japan false alarm 00:42:58.380 --> 00:43:06.010 case in 2018, January 5th. Maybe here people are not very aware 00:43:06.010 --> 00:43:10.070 of this, but in Japan, it was quite big, especially for JMA. 00:43:10.070 --> 00:43:14.770 Because this is a time where we actually implemented the methods 00:43:14.770 --> 00:43:20.220 already in official system. Unfortunately, this is a false alarm. 00:43:20.220 --> 00:43:24.590 But [chuckles] I’m here. I can talk very honestly. 00:43:24.590 --> 00:43:28.620 This false alarm was not our fault. [laughter] 00:43:28.620 --> 00:43:35.580 So, why? Our system actually was – this is – this is how the P wave picking 00:43:35.580 --> 00:43:42.700 looking like when you plot the stations and the time in terms of the spatial 00:43:42.700 --> 00:43:48.620 distance. You can kind of clearly see two line, right? 00:43:48.620 --> 00:43:52.240 And so, for human, probably it’s quite easy to recognize. 00:43:52.240 --> 00:43:57.760 But, of course, this is not as clear-cut as every – as some other cases 00:43:57.770 --> 00:44:02.850 we’ve looked at. But we are actually able to get this 00:44:02.850 --> 00:44:11.710 done and recognize two events clearly. What happened was, the merging of 00:44:11.710 --> 00:44:15.500 the two system – the existing JMA prediction system 00:44:15.500 --> 00:44:20.460 versus the system we had had some problem. 00:44:20.460 --> 00:44:26.800 So, even though our system got it right, they published the results by somehow 00:44:26.800 --> 00:44:31.820 combining the two system, which the original system got it completely wrong 00:44:31.820 --> 00:44:35.740 and ends up they’re publishing a wrong result. 00:44:35.740 --> 00:44:40.510 So, from there, people get more and more alert of the fact that, 00:44:40.510 --> 00:44:44.440 how should we even think about combining multiple systems, 00:44:44.440 --> 00:44:49.680 and I think this is what Sarah has published a paper on. 00:44:49.680 --> 00:44:55.100 And I think Japan is getting more awareness on this problem too. 
00:44:55.100 --> 00:44:56.680 So … 00:44:58.280 --> 00:45:03.760 In conclusion, I think – I think it’s naive to say that. 00:45:03.760 --> 00:45:07.610 Everyone knows station coverage is very important, but what I want to stress 00:45:07.610 --> 00:45:11.590 here is that, especially when – if you want to think about quantifying 00:45:11.590 --> 00:45:16.910 the uncertainty of your prediction, this is for sure even more important. 00:45:16.910 --> 00:45:19.920 So every single resources available should be used. 00:45:19.920 --> 00:45:25.550 This conclusion is particularly written and used a lot when we try to convince 00:45:25.550 --> 00:45:31.570 JMA to use more system – to use whatever station they have, 00:45:31.570 --> 00:45:33.800 even though sometimes they may feel like there’s a lot of 00:45:33.800 --> 00:45:37.830 trouble to combining the information from different system. 00:45:37.830 --> 00:45:43.180 The uncertainty is also very useful, in my opinion, in this particular case. 00:45:43.180 --> 00:45:47.020 Because most of the predictions does have quite a high uncertainty. 00:45:47.020 --> 00:45:52.960 And a lot of these station may have – it’s not always very robust. 00:45:52.960 --> 00:45:56.380 So be able to capture that when the information is little, 00:45:56.380 --> 00:46:00.300 especially in early warning system. We want to be fast, at the same time, 00:46:00.300 --> 00:46:04.620 be accurate. So the tradeoff would be – 00:46:04.620 --> 00:46:09.080 would be better done if you also utilized the uncertainty information. 00:46:09.080 --> 00:46:14.730 And also, of course, as a – as one more benefit is it would be able to help you to 00:46:14.730 --> 00:46:20.690 justify to your prediction, which means that you can use it – use this as a metric 00:46:20.690 --> 00:46:25.390 to help you to decide if there’s multiple events or not, which is how we 00:46:25.390 --> 00:46:31.340 implemented our algorithm here. So the only concern is when you try to 00:46:31.340 --> 00:46:37.650 get estimation of your uncertainty, and you want to do it kind of properly, 00:46:37.650 --> 00:46:42.210 such that the estimation is not too bad, would be to sacrifice your computation 00:46:42.210 --> 00:46:46.230 time, which is something that you don’t want to sacrifice in early warning 00:46:46.230 --> 00:46:51.200 system. So smartly figuring out some approximation methods would be 00:46:51.200 --> 00:46:57.990 a very big key here, in my opinion. And I don’t think we have done the 00:46:57.990 --> 00:47:03.660 best job possibly, but so far, the system has been working decent. 00:47:04.460 --> 00:47:10.540 But definitely, for example, the PLUM system, developed by Dr. Hoshiba 00:47:10.540 --> 00:47:15.870 in Japan, is definitely more efficient in terms of warning time 00:47:15.870 --> 00:47:19.651 in particular cases. So I think there’s things we can 00:47:19.651 --> 00:47:24.050 improve, either by improving the approximation method we have here or 00:47:24.050 --> 00:47:29.180 by combining multiple systems and do a smart thing to combine the predictions. 00:47:30.300 --> 00:47:32.540 That’s it for my talk. Thank you very much. 00:47:32.540 --> 00:47:37.620 [Applause] 00:47:37.840 --> 00:47:44.420 - Thank you, Stephen. Does anyone have any questions for our speaker? 00:47:46.560 --> 00:47:50.900 - Thanks. That was really interesting. So I understood the original description 00:47:50.910 --> 00:47:54.820 as being applied to point sources – one or two or three point sources. 
00:47:54.820 --> 00:47:59.410 But does it get more complicated if it’s just one event and it’s a big 00:47:59.410 --> 00:48:03.280 finite source – a magnitude 6-point-something, for example? 00:48:03.280 --> 00:48:10.170 - Yes. It does have trouble when it’s finite event – finite fault event. 00:48:10.170 --> 00:48:18.900 What we have seen is that, for example, the main – the actual 2011 earthquake 00:48:18.900 --> 00:48:25.490 is that, because it’s propagated – so the station – the way we set our 00:48:25.490 --> 00:48:31.810 decision threshold to create a new event, usually would not trigger – at least for, 00:48:31.810 --> 00:48:35.770 so far, a few finite event that we’ve checked. 00:48:35.770 --> 00:48:39.170 But there is definitely a little sort of echo thing. 00:48:39.170 --> 00:48:42.940 How you set the threshold such that it doesn’t trigger in finite event. 00:48:42.940 --> 00:48:48.760 The alternative thing that we are thinking now is, of course, to utilize – 00:48:48.760 --> 00:48:53.620 to maybe utilize some complicated model allow us to capture the finite events. 00:48:53.620 --> 00:48:57.780 But we have figured out how to do it efficiently. 00:48:57.780 --> 00:49:02.490 Because the key thing here is we want to get uncertainty as well. 00:49:02.490 --> 00:49:08.760 And if we just implement a more complicated model in our likelihood 00:49:08.760 --> 00:49:14.560 function, which basically means a calculation of your – of your ground 00:49:14.560 --> 00:49:21.050 motion prediction model, we – every single dot there requires one calculation. 00:49:21.050 --> 00:49:27.050 So a naive way of using the model would definitely kill the system. 00:49:27.050 --> 00:49:30.670 So we’re still thinking about that. But, yeah, for the implementation, 00:49:30.670 --> 00:49:36.680 so far, we are able to find nice, tricky threshold that works fine. 00:49:37.540 --> 00:49:39.680 In Japan, at least. 00:49:41.540 --> 00:49:44.420 [Silence] 00:49:45.040 --> 00:49:49.780 - You spoke about an example in 2011, I think, of two small earthquakes 00:49:49.780 --> 00:49:52.240 that were interpreted as one big one? - Yes. 00:49:52.240 --> 00:49:56.320 - How long did it actually take in seconds to correct the 00:49:56.320 --> 00:49:58.820 original interpretation? - Uh-huh. 00:49:58.820 --> 00:50:00.880 Let me check. 00:50:04.100 --> 00:50:08.960 Correct – okay. Let me put it this way. In our system, whenever we check – 00:50:08.970 --> 00:50:13.480 whenever there’s a new trigger, we check if that new trigger belongs 00:50:13.480 --> 00:50:17.040 to the previous earthquake or not. And, in this particular – in that 00:50:17.040 --> 00:50:21.760 particular case, we are able to figure out the first trigger was already – in the first 00:50:21.760 --> 00:50:26.260 trigger of the new earthquake, it just doesn’t fit at all. 00:50:26.260 --> 00:50:33.350 So basically, the first P wave triggered is the time that we realize there’s 00:50:33.350 --> 00:50:36.840 a new earthquake. So there’s pretty much no correction afterwards. 00:50:38.900 --> 00:50:41.220 Does it make sense? 00:50:42.160 --> 00:50:46.020 So there’s one event triggered. 00:50:47.080 --> 00:50:48.480 And … 00:50:50.560 --> 00:50:52.660 I think we are almost there. 00:50:54.060 --> 00:50:55.600 Here. - Yeah. Yeah. 00:50:55.600 --> 00:51:00.830 - So there’s one event triggered at the very beginning. 00:51:00.830 --> 00:51:05.210 It’s the starting count of the time to be zero, right here. 00:51:05.210 --> 00:51:14.260 We continue to check. 
And suddenly, we realize the 00:51:14.260 --> 00:51:19.690 ground motion time series suddenly shows something new. 00:51:19.690 --> 00:51:25.930 And so there, we realize there’s basically a new trigger. 00:51:25.930 --> 00:51:30.530 Even though it’s already a triggered station, after a certain time it’s no longer 00:51:30.530 --> 00:51:34.780 counted as a triggering station, right? It’s already triggered. 00:51:34.780 --> 00:51:39.220 So there’s a new trigger later on. And when this new trigger comes in, 00:51:39.220 --> 00:51:44.930 we check once again whether this new trigger belongs to the existing event. 00:51:44.930 --> 00:51:49.640 And we figure out that it actually does not. So we directly create a new event 00:51:49.640 --> 00:51:55.510 in that case. So it’s not like we don’t realize there’s a second event 00:51:55.510 --> 00:51:57.600 even though the second event has already started, 00:51:57.600 --> 00:52:00.090 and then only later on figure out there’s a second one. 00:52:00.090 --> 00:52:04.890 We basically figure it out right away when it started. 00:52:04.890 --> 00:52:07.960 Does that kind of answer your question? 00:52:11.340 --> 00:52:14.200 - Were you asking how long in 2011 it took for JMA 00:52:14.200 --> 00:52:17.640 to take back the alert and say, I’m sorry? 00:52:19.220 --> 00:52:22.740 - Say I’m sorry? I don’t know about saying I’m sorry. 00:52:22.740 --> 00:52:26.470 But just simply to understand what was correct. 00:52:26.470 --> 00:52:29.760 - So I think what he’s trying to ask is, in 2011, in real time, with the actual 00:52:29.760 --> 00:52:34.940 JMA system in production, how did they let the public know that this 00:52:34.950 --> 00:52:37.250 had been a false alarm, that there wasn’t really a large earthquake? 00:52:37.250 --> 00:52:40.300 What is the response to a false alarm? 00:52:40.300 --> 00:52:48.020 - Oh. So basically, it is a false alert. And it’s way after everything … 00:52:48.020 --> 00:52:51.420 - [inaudible] - Oh. 00:52:52.300 --> 00:52:54.440 Sorry, I probably have to double-check. 00:52:54.440 --> 00:52:57.480 I don’t know the answer right now. Yeah. 00:52:59.900 --> 00:53:02.700 - Yeah. I have a question. That was a fascinating talk. 00:53:02.700 --> 00:53:07.860 One question I have is, you used the uncertainty to differentiate – 00:53:07.860 --> 00:53:11.400 to identify a second event, but uncertainty could also come from things 00:53:11.400 --> 00:53:16.870 like basin amplification or directivity. How do you determine that it wasn’t 00:53:16.870 --> 00:53:20.660 a site effect, for example, that produced amplification at some distance, and it 00:53:20.660 --> 00:53:24.300 really was an event that was discrete from the first event? 00:53:24.300 --> 00:53:28.890 - Yeah. This is a really hard and tricky part. 00:53:28.890 --> 00:53:33.680 Because everything we’ve done is basically to lay out a way to do 00:53:33.680 --> 00:53:37.480 uncertainty quantification using the so-called Bayesian method. 00:53:37.480 --> 00:53:40.780 Or, to be more specific, to just calculate the likelihood with samples so that 00:53:40.780 --> 00:53:44.460 you can get a sense of the uncertainty. That’s everything we’ve done. 00:53:44.460 --> 00:53:49.620 So nothing really fancy. At the end of the day, all these issues 00:53:49.620 --> 00:53:54.680 have to be taken care of by whoever decides how to calculate the 00:53:54.680 --> 00:53:59.840 likelihoods as accurately as possible.
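What Stephen describes here – evaluating the likelihood of each new trigger against samples of the current event’s parameters, and spawning a new event when that likelihood falls below a threshold – can be sketched roughly as follows. This is a minimal, hypothetical Python illustration: the toy ground-motion relation, the Gaussian residual model, the parameter names, and the threshold value are all assumptions for exposition, not the actual IPF/JMA implementation.

```python
import numpy as np

# Hypothetical sketch of a sample-based consistency check for a new trigger.
# All names, the ground-motion relation, and the threshold are illustrative.

def gmpe_log_amplitude(magnitude, distance_km):
    """Toy ground-motion prediction: log10 amplitude from magnitude and distance."""
    return 0.5 * magnitude - 1.5 * np.log10(distance_km + 10.0) - 1.0

def trigger_log_likelihood(particles, station_xy, observed_log_amp, sigma=0.35):
    """Log of the sample-averaged likelihood of one trigger observation.

    particles: array of shape (N, 3) with columns (x_km, y_km, magnitude),
    i.e. samples of the current event's parameters.
    """
    dx = particles[:, 0] - station_xy[0]
    dy = particles[:, 1] - station_xy[1]
    dist = np.sqrt(dx ** 2 + dy ** 2)
    pred = gmpe_log_amplitude(particles[:, 2], dist)
    # Gaussian residual in log-amplitude, evaluated once per sample
    log_like = (-0.5 * ((observed_log_amp - pred) / sigma) ** 2
                - np.log(sigma * np.sqrt(2.0 * np.pi)))
    # log of the mean likelihood over samples (log-sum-exp for stability)
    return np.logaddexp.reduce(log_like) - np.log(len(particles))

def belongs_to_existing_event(particles, station_xy, observed_log_amp, threshold=-6.0):
    """True if the new trigger is plausibly explained by the current event."""
    return trigger_log_likelihood(particles, station_xy, observed_log_amp) > threshold

# Example: samples around a M5 event near the origin.
rng = np.random.default_rng(0)
particles = np.column_stack([rng.normal(0.0, 5.0, 1000),
                             rng.normal(0.0, 5.0, 1000),
                             rng.normal(5.0, 0.3, 1000)])

print(belongs_to_existing_event(particles, (10.0, 0.0), observed_log_amp=0.2))   # nearby trigger: consistent
print(belongs_to_existing_event(particles, (150.0, 0.0), observed_log_amp=1.5))  # distant, strong trigger: likely new event
```

In a real system the samples would come from the running filter, and the ground-motion model and residual scatter would be the operational ones; the sketch is only meant to show that each new trigger costs one ground-motion evaluation per sample, which is the computational burden mentioned earlier.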
So what we have done is – 00:53:59.840 --> 00:54:07.180 so far, we haven’t taken the local effects into account. 00:54:08.440 --> 00:54:14.640 But, in Japan, there is quite a nice catalog for Vs30 and related data. 00:54:14.650 --> 00:54:21.570 I think, in our paper, or in a different 00:54:21.570 --> 00:54:26.560 paper, we show that there’s quite a clear-cut – not a super clear-cut, 00:54:26.560 --> 00:54:31.600 but quite a nice – difference between the error coming from the 00:54:31.600 --> 00:54:35.890 local effect versus the error due to, for example, 00:54:35.890 --> 00:54:41.150 a new event or just something random. 00:54:41.150 --> 00:54:45.260 So we used that as a threshold. In short, we just pick a smart, 00:54:45.260 --> 00:54:51.470 reasonable threshold right now in this system. But to improve that, people have been 00:54:51.470 --> 00:54:57.680 consistently trying to add this part into the model, and what that 00:54:57.680 --> 00:55:04.000 basically does is make your threshold clearer, to reduce your chance of 00:55:04.000 --> 00:55:09.440 running into false cases when you set the threshold. 00:55:09.440 --> 00:55:12.920 So I guess there’s no tricky part in the algorithm we designed, but just … 00:55:12.920 --> 00:55:16.630 - Yeah. So it sounds like it’s an empirical separation between 00:55:16.630 --> 00:55:20.120 the uncertainty to expect from site effects versus the uncertainty to 00:55:20.120 --> 00:55:22.420 expect from two discrete sources. - Yes. 00:55:22.420 --> 00:55:24.480 - Okay. Thank you. 00:55:26.260 --> 00:55:28.660 - Are there any additional questions? 00:55:31.780 --> 00:55:34.940 Okay. With that, let’s thank our speaker again. 00:55:35.260 --> 00:55:39.760 [Applause] 00:55:39.770 --> 00:55:44.550 We will be taking our speaker out to lunch at Mardini’s after this, 00:55:44.550 --> 00:55:47.550 expecting to be back at 1:00 p.m. So if you’d like to join us, 00:55:47.550 --> 00:55:50.920 please meet us at the front, and we’ll arrange carpooling. 00:55:51.680 --> 00:56:04.280 [Silence] 00:56:04.580 --> 00:56:07.320 [inaudible conversations]