Data for both Part A and Part B¶

This project uses data from the Queensland Government Open Data Portal. Both parts will use data on Queensland Wave Monitoring. Part B will also use data on Storm tide monitoring.

For this project, I will use the Coastal Data System - Near real time wave data and for Part B I will add Coastal Data System – Near real time storm tide data. Note that this data will change over the time period of the assignment.


Part A¶

QUESTION:¶

What can we learn from the wave height data for South East Queensland, and how might this data be used strategically during a major weather event?

[Q1] Read the data¶

  • Open the CSV version of the file. Open directly from the URL into a pandas dataframe.
  • Identify an appropriate index, and make a note of the columns.
In [2]:
# import library
import pandas as pd
import datetime
import pytz

# url for the latest data set
url = "https://www.data.qld.gov.au/datastore/dump/2bbef99e-9974-49b9-a316-57402b00609c?bom=True"

# read the data set to the notebook with index "_id" 
wave_df = pd.read_csv(url,index_col="_id")

# display the date of access
# get the current time in the local timezone
localTimezone = pytz.timezone('Australia/Brisbane')  
waveRecentAccess = datetime.datetime.now(localTimezone)
print(f"The current data was accessed on {waveRecentAccess:%d-%m-%Y %H:%M}")

# get column names
wave_headings = list(wave_df.columns)

# display column names
[noRow,noCol]=wave_df.shape # spread the shape of the data frame into two variables
print(f"There are {noRow} rows in the dataframe.")
print(f"There are {noCol} columns in the dataframe, which are:")
for col in wave_headings:
    print(">",col)
The current data was accessed on 13-04-2025 22:23
There are 7504 rows in the dataframe.
There are 14 columns in the dataframe, which are:
> Site
> SiteNumber
> Seconds
> DateTime
> Latitude
> Longitude
> Hsig
> Hmax
> Tp
> Tz
> SST
> Direction
> Current Speed
> Current Direction

[Q2] Save the data¶

  • Transform the grouped data into a dataframe
  • Save the dataframe as a CSV file with the date that reflects the URL access in Q1
In [3]:
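# save the data retrieved from the internet
# the timestamp is formatted without spaces or colons so the file name is valid on all platforms
path="data/"
file_name_recent=f'wave_data({waveRecentAccess:%Y-%m-%d_%H%M}).csv'
wave_df.to_csv(f'{path}{file_name_recent}')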
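Because the access time becomes part of the file name, its string form matters: the default rendering of a timezone-aware datetime contains spaces and colons, which some file systems (notably Windows) reject. Below is a minimal sketch of building a portable name. The helper `timestamped_name` is hypothetical, and the timestamp is a fixed example rather than the live access time.

```python
import datetime

def timestamped_name(prefix: str, accessed: datetime.datetime) -> str:
    """Build a CSV file name that embeds the access time without spaces or colons."""
    # %Y-%m-%d_%H%M keeps the name sortable and valid on common file systems
    return f"{prefix}({accessed:%Y-%m-%d_%H%M}).csv"

# fixed example timestamp (illustrative only)
example_access = datetime.datetime(2025, 4, 13, 22, 23)
example_name = timestamped_name("wave_data", example_access)
print(example_name)
```

The same format string can be used for the storm tide file in Part B, so both saved files follow one naming pattern.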

Read the data from a file¶

  • To read the same data back in (rather than up-to-date data), write code here to read in the file from Q2
In [4]:
# Read the data locally with index "_id"
wave_file_df = pd.read_csv(f"{path}{file_name_recent}",index_col="_id")
wave_file_df
Out[4]:
Site SiteNumber Seconds DateTime Latitude Longitude Hsig Hmax Tp Tz SST Direction Current Speed Current Direction
_id
1 Caloundra 54 1743861600 2025-04-06T00:00:00 -26.84675 153.15581 0.714 1.14 6.67 5.479 25.25 85.80 -99.9 -99.9
2 Caloundra 54 1743863400 2025-04-06T00:30:00 -26.84688 153.15564 0.716 1.23 6.67 5.479 25.25 81.60 -99.9 -99.9
3 Caloundra 54 1743865200 2025-04-06T01:00:00 -26.84700 153.15555 0.677 1.20 6.67 5.634 25.15 83.00 -99.9 -99.9
4 Caloundra 54 1743867000 2025-04-06T01:30:00 -26.84697 153.15549 0.717 1.29 6.67 5.797 25.15 87.20 -99.9 -99.9
5 Caloundra 54 1743868800 2025-04-06T02:00:00 -26.84699 153.15553 0.708 1.10 6.67 5.714 25.20 87.20 -99.9 -99.9
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
7500 Hay Point TriAxys 4740tx 1744537200 2025-04-13 19:40:00 -21.27830 149.32270 1.860 2.91 5.60 4.800 26.50 119.26 -99.9 -99.9
7501 Hay Point TriAxys 4740tx 1744538400 2025-04-13 20:00:00 -21.27830 149.32270 1.940 3.15 5.90 5.000 26.48 122.26 -99.9 -99.9
7502 Hay Point TriAxys 4740tx 1744539600 2025-04-13 20:20:00 -21.27830 149.32260 1.900 3.04 5.90 5.000 26.47 114.26 -99.9 -99.9
7503 Hay Point TriAxys 4740tx 1744540800 2025-04-13 20:40:00 -21.27830 149.32260 1.890 3.21 6.20 5.200 26.45 115.26 -99.9 -99.9
7504 Hay Point TriAxys 4740tx 1744542000 2025-04-13 21:00:00 -21.27820 149.32250 1.930 3.51 5.90 4.900 26.44 114.26 -99.9 -99.9

7504 rows × 14 columns

[Q3] Analyse the data¶

  • Filter the data to include only sites for the South East coast (from Gold Coast to Sunshine Coast)
  • Group the filtered data by site
  • Obtain an appropriate aggregate for the groups (e.g. Sum, Mean, etc)
  • Save the grouped data as a new dataframe
Correct the data format¶

The "DateTime" column is converted from string to a "datetime object" for easy manipulation.

In [6]:
wave_file_df['DateTime'] = pd.to_datetime(wave_file_df['DateTime'], errors='coerce')
print(wave_file_df.dtypes)
Site                         object
SiteNumber                   object
Seconds                       int64
DateTime             datetime64[ns]
Latitude                    float64
Longitude                   float64
Hsig                        float64
Hmax                        float64
Tp                          float64
Tz                          float64
SST                         float64
Direction                   float64
Current Speed               float64
Current Direction           float64
dtype: object
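With `errors='coerce'`, any value that cannot be parsed silently becomes `NaT`, so it is worth counting the missing timestamps after the conversion. Here is a small sketch on made-up strings (not rows from the dataset):

```python
import pandas as pd

# made-up sample: two valid timestamps and one unparseable entry
raw = pd.Series(["2025-04-06T00:00:00", "2025-04-06T00:30:00", "not a date"])

parsed = pd.to_datetime(raw, errors="coerce")

# number of entries that failed to parse and were coerced to NaT
n_failed = int(parsed.isna().sum())
print(f"{n_failed} of {len(parsed)} timestamps could not be parsed")
```

Running the same count on `wave_file_df['DateTime']` would show whether any rows were lost in the conversion.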
Correct the name of sites¶

Some site names include the buoy unit name, e.g. "Mk4". The unit name is technical information that is not relevant to the analysis, so it is removed to avoid confusion.
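Other suffixes may need the same treatment. For instance, the data also contain a site called "Hay Point TriAxys". If "TriAxys" is likewise a buoy model name (an assumption here, not something the dataset documents), a regular expression can strip several suffixes in one pass. The sketch below uses a hypothetical `UNIT_SUFFIXES` pattern and illustrative names.

```python
import re

# assumed list of buoy model suffixes; "Mk4" comes from the notebook, "TriAxys" is a guess
UNIT_SUFFIXES = re.compile(r"\s*\b(Mk4|TriAxys)\b\s*$")

def clean_site_name(name: str) -> str:
    """Remove a trailing buoy model suffix and surrounding whitespace."""
    return UNIT_SUFFIXES.sub("", name).strip()

# illustrative names, not necessarily the exact strings in the dataset
for raw_name in ["Caloundra", "Brisbane Mk4", "Hay Point TriAxys"]:
    print(raw_name, "->", clean_site_name(raw_name))
```

In pandas, the same pattern could be applied to the whole column with `Series.str.replace(..., regex=True)`.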

In [7]:
wave_file_df['Site'] = wave_file_df['Site'].str.replace('Mk4','').str.rstrip()
print("> new set of site names:",set(wave_file_df["Site"]))
> new set of site names: {'Wide Bay', 'Gladstone', 'Cairns', 'Caloundra', 'Brisbane', 'Gold Coast', 'Emu Park', 'Tweed Heads', 'Townsville', 'Palm Beach', 'Mackay', 'Poruma West', 'Mooloolaba', 'Albatross Bay', 'Tweed Offshore', 'North Moreton Bay', 'Hay Point TriAxys', 'Bundaberg', 'Skardon River Outer', 'Bilinga'}
Filter data¶
  1. Retrieve Significant Wave Height and Average Zero Upcrossing Wave Period – The dataset consists of 14 columns. However, only Significant Wave Height (Hsig) and Average Zero Upcrossing Wave Period (Tz) are needed to answer the question, as together they indicate the wave energy.

    Significant Wave Height is used instead of Maximum Wave Height (Hmax) because it is more representative of the general sea state. Hsig is the average height of the highest one-third of waves, so it is higher than the most frequent wave height (Bureau of Meteorology, 2015). Hmax, on the other hand, records the single highest wave in the recording period. It can therefore capture extreme values that rarely occur, which may make the analysis too sensitive and reduce its credibility. For the same reason, the average zero upcrossing wave period (Tz) is used instead of the peak wave period (Tp).

    Hsig VS Hmax.png

    (Bureau of Meteorology, 2015)

  2. Filter to include only South East coast data – To keep the study focused, only monitoring sites along the South East coast, from Mooloolaba in the north to Tweed Offshore in the south, are included. The data are filtered by latitude: only records with a latitude below -26.56° (that is, at or south of Mooloolaba) are retained.
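To make the difference between Hsig and Hmax in point 1 concrete, the sketch below computes both from a small, made-up list of individual wave heights. Hsig is taken as the mean of the highest one-third of waves. The function name and sample values are illustrative assumptions.

```python
def significant_wave_height(heights):
    """Mean height of the highest one-third of the waves."""
    ordered = sorted(heights, reverse=True)
    top_third = ordered[: max(1, len(ordered) // 3)]
    return sum(top_third) / len(top_third)

# made-up wave heights in metres, including one unusually large wave
sample_heights = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 3.0]

hsig = significant_wave_height(sample_heights)
hmax = max(sample_heights)
print(f"Hsig = {hsig:.2f} m, Hmax = {hmax:.2f} m")
```

The single 3.0 m wave sets Hmax on its own, while Hsig is only partly pulled up by it, which is why Hsig is the steadier summary of the sea state.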

In [8]:
# Filtering data
wave_SEQ_df=wave_file_df[wave_file_df["Latitude"]<-26.56]
wave_SEQ_df = wave_SEQ_df[['Site','DateTime','Hsig', 'Tz','Longitude', 'Latitude']]
wave_SEQ_df
Out[8]:
Site DateTime Hsig Tz Longitude Latitude
_id
1 Caloundra 2025-04-06 00:00:00 0.714 5.479 153.15581 -26.84675
2 Caloundra 2025-04-06 00:30:00 0.716 5.479 153.15564 -26.84688
3 Caloundra 2025-04-06 01:00:00 0.677 5.634 153.15555 -26.84700
4 Caloundra 2025-04-06 01:30:00 0.717 5.797 153.15549 -26.84697
5 Caloundra 2025-04-06 02:00:00 0.708 5.714 153.15553 -26.84699
... ... ... ... ... ... ...
6558 Bilinga 2025-04-13 18:30:00 1.590 5.770 153.51279 -28.14245
6559 Bilinga 2025-04-13 19:00:00 1.730 6.190 153.51281 -28.14244
6560 Bilinga 2025-04-13 19:30:00 1.810 6.540 153.51277 -28.14196
6561 Bilinga 2025-04-13 20:00:00 1.720 6.160 153.51281 -28.14193
6562 Bilinga 2025-04-13 20:30:00 1.950 6.170 153.51279 -28.14193

3383 rows × 6 columns

Sort data¶

Sort data by latitude – The data are originally ordered by monitoring site. To make the ordering more meaningful, the data are sorted by timestamp and then by latitude in descending order, which arranges the sites from north to south within each timestamp.

In [9]:
# Sorting data
wave_SEQ_df=wave_SEQ_df.copy()
wave_SEQ_df = wave_SEQ_df.sort_values(by=["DateTime","Latitude"], ascending=[True, False])
wave_SEQ_df
Out[9]:
Site DateTime Hsig Tz Longitude Latitude
_id
746 Mooloolaba 2025-04-06 00:00:00 1.044 5.797 153.18452 -26.56684
1 Caloundra 2025-04-06 00:00:00 0.714 5.479 153.15581 -26.84675
375 North Moreton Bay 2025-04-06 00:00:00 0.754 5.333 153.28159 -26.90002
4818 Brisbane 2025-04-06 00:00:00 1.070 5.270 153.63188 -27.48649
5194 Gold Coast 2025-04-06 00:00:00 0.980 6.090 153.43906 -27.96435
... ... ... ... ... ... ...
4817 Palm Beach 2025-04-13 20:00:00 2.030 5.770 153.48567 -28.09926
6561 Bilinga 2025-04-13 20:00:00 1.720 6.160 153.51281 -28.14193
4103 Tweed Heads 2025-04-13 20:00:00 2.410 6.130 153.57637 -28.17828
5571 Gold Coast 2025-04-13 20:30:00 2.330 5.840 153.43921 -27.96417
6562 Bilinga 2025-04-13 20:30:00 1.950 6.170 153.51279 -28.14193

3383 rows × 6 columns

Group the filtered data¶
  1. By monitoring site – Wave characteristics may differ by location, so each site should be analyzed separately. To facilitate comparison, the data are grouped by monitoring site.
  2. By date – The data are recorded every 20 to 30 minutes. To make the data manageable, they are also grouped by date.
Obtain appropriate aggregates for the groups¶
  1. Daily maximum Hsig and Tz – The maximum Hsig and Tz are calculated to summarize the daily wave conditions at each monitoring site. These maxima capture the peak wave activity of each day, which is useful for safety planning. Note that the values are derived from Hsig and Tz rather than Hmax and Tp. This is because Hmax and Tp describe the highest and the most energetic waves, which are less representative of typical conditions. In contrast, Hsig is the average height of the highest one-third of waves and gives a more general measure of the wave conditions. Taking the daily maximum of Hsig and Tz therefore captures the peak of the commonly occurring wave conditions while limiting the influence of extreme cases.
In [10]:
# Grouping the filtered data and obtaining daily maximum Hsig and Tz
wave_SEQ_group_max_df = wave_SEQ_df.groupby(["Site", wave_SEQ_df['DateTime'].dt.date], sort=False).agg({'Hsig': 'max','Tz': 'max','Longitude': 'first','Latitude': 'first'})
wave_SEQ_group_max_df = wave_SEQ_group_max_df.reset_index(names=['Site','DateTime'])

wave_SEQ_group_max_df
Out[10]:
Site DateTime Hsig Tz Longitude Latitude
0 Mooloolaba 2025-04-06 1.329 6.250 153.18452 -26.56684
1 Caloundra 2025-04-06 0.891 5.970 153.15581 -26.84675
2 North Moreton Bay 2025-04-06 1.053 5.797 153.28159 -26.90002
3 Brisbane 2025-04-06 1.740 6.950 153.63188 -27.48649
4 Gold Coast 2025-04-06 1.190 7.210 153.43906 -27.96435
... ... ... ... ... ... ...
67 Gold Coast 2025-04-13 2.550 5.960 153.43906 -27.96420
68 Palm Beach 2025-04-13 2.430 6.040 153.48579 -28.09923
69 Bilinga 2025-04-13 2.190 6.540 153.51283 -28.14198
70 Tweed Heads 2025-04-13 2.560 6.260 153.57632 -28.17834
71 Tweed Offshore 2025-04-13 3.040 6.570 153.68205 -28.21257

72 rows × 6 columns
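The same daily aggregation can also be written with `pd.Grouper`, which keeps the date as a proper datetime rather than a Python `date` object. Here is a sketch on made-up readings (the values are illustrative, not taken from the dataset):

```python
import pandas as pd

# made-up half-hourly readings for two sites over two days
toy = pd.DataFrame({
    "Site": ["A", "A", "A", "B", "B", "B"],
    "DateTime": pd.to_datetime([
        "2025-04-06 00:00", "2025-04-06 00:30", "2025-04-07 00:00",
        "2025-04-06 00:00", "2025-04-06 00:30", "2025-04-07 00:00",
    ]),
    "Hsig": [0.7, 0.9, 1.1, 1.0, 0.8, 1.4],
})

# group by site and calendar day, then take the daily maximum
daily_max = (
    toy.groupby(["Site", pd.Grouper(key="DateTime", freq="D")])["Hsig"]
    .max()
    .reset_index()
)
print(daily_max)
```

Keeping `DateTime` as a datetime column can make later time-based filtering and plotting simpler, at the cost of a slightly less obvious grouping expression.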

[Q4] Visualise the data¶

  • Visualise the grouped data with an appropriate chart
  • Ensure X and Y axes are labelled appropriately
  • Add an appropriate title for the chart

Daily maximum Hsig and Tz trends across sites¶

To analyze the temporal trends of wave conditions, two line charts are used to display the daily maximum Hsig and Tz across the monitoring sites. Since the data form a time series, a line chart can show changes over time.

From these two graphs, Hsig and Tz can be compared between sites. Each monitoring site is represented by a different color to make wave trends easier to compare across locations.

In [12]:
import plotly.express as px

# the range of data
startDate=min(wave_SEQ_group_max_df["DateTime"])
endDate=max(wave_SEQ_group_max_df["DateTime"])
dateRange=f"{startDate} to {endDate}"

# significant wave height across time 
lineChartHsig = px.line(
    wave_SEQ_group_max_df, 
    x="DateTime", 
    y="Hsig", 
    color="Site", 
    title=f"Figure 1: Daily maximum significant wave height <br> of South East Queensland: {dateRange}",
    labels={"DateTime": "Date", "Hsig": "Daily maximum significant wave height (m)"},
    width=750,
    height=500
)
lineChartHsig.show()

# zero upcrossing wave period across time 
lineChartTz = px.line(
    wave_SEQ_group_max_df, 
    x="DateTime", 
    y="Tz", 
    color="Site", 
    title=f"Figure 2: Daily maximum average zero upcrossing wave period <br> of South East Queensland: {dateRange}",
    labels={"DateTime": "Date", "Tz": "Daily maximum zero upcrossing wave period (s)"},
    width=750,
    height=500
)
lineChartTz.show()

Past 48 hour trend of Hsig and Tz in different sites¶

Besides comparing Hsig and Tz between sites, comparing Hsig with Tz within each site also provides useful information. Each site is shown in a separate subplot for clarity. Plotting Hsig and Tz on the same axes shows how significant wave height and zero upcrossing wave period change together at each site.

Only the latest 48 hours of data are displayed in these graphs. This makes it easier to observe short-term variations.

The plot includes two reference lines to assess the risk of foreshore damage. A red dashed horizontal line at 6 m indicates the reference wave height for potential foreshore damage. A blue dotted horizontal line at 12 s marks a reference wave period associated with increased risk. According to Evans (2007), the percentage of fatalities from wave action increases significantly when the wave period exceeds 12 seconds. These reference lines help in understanding the energy of the waves and the potential risks.
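The two reference values can also be applied programmatically. The sketch below flags a reading against the 6 m and 12 s thresholds from the text. The function name and the three labels are illustrative assumptions, not an established risk scale.

```python
HSIG_THRESHOLD_M = 6.0   # reference wave height for potential foreshore damage
TZ_THRESHOLD_S = 12.0    # reference wave period for increased risk

def flag_reading(hsig: float, tz: float) -> str:
    """Label a reading by how many reference thresholds it reaches."""
    reached = int(hsig >= HSIG_THRESHOLD_M) + int(tz >= TZ_THRESHOLD_S)
    return ["below both", "one threshold reached", "both thresholds reached"][reached]

# a Gold Coast reading shown in the notebook output (Hsig 2.33 m, Tz 5.84 s)
print(flag_reading(2.33, 5.84))
# hypothetical storm readings
print(flag_reading(6.5, 9.0))
print(flag_reading(6.5, 12.5))
```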

In [13]:
color_map = {"Hsig": "red", "Tz": "blue"}
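# current Brisbane time as a naive timestamp, to match the timezone-naive DateTime column
currentTime = datetime.datetime.now(localTimezone).replace(tzinfo=None)

# Filter for the last 48 hours
wave_recent_48h = wave_SEQ_df[wave_SEQ_df["DateTime"] >= currentTime - pd.Timedelta(hours=48)]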


# Create the line plot with separate subplots for each Site
lineChartHsigSites = px.line(
    wave_recent_48h,
    x="DateTime",
    y=["Hsig", "Tz"], 
    facet_col="Site",
    facet_col_wrap=3,
    height=800, 
    title=f"Figure 3: Past 48 hours Significant Wave Height (Hsig) and Zero Upcrossing Wave Period (Tz) of South East Queensland<br> (updated at: {waveRecentAccess:%Y-%m-%d %H:%M})",
    labels={"DateTime": "Date", "Hsig": "Significant Wave Height (m)", "Tz": "Average Zero Upcrossing Wave Period (s)"},
    color_discrete_map=color_map,  # Apply custom colors
)

lineChartHsigSites.update_layout(
    title_font_size=15,  
    title_x=0.5,  
    legend_title_font_size=15,
)

# Add a horizontal line at 6m for Hsig
lineChartHsigSites.add_hline(
    y=6,
    line_dash="dash",
    line_color="red",
    annotation_text="6m potential damage",
    annotation_position="bottom right",
)

# Add a horizontal line at 12s for Tz
lineChartHsigSites.add_hline(
    y=12,
    line_dash="dot",
    line_color="blue",
    annotation_text="12s potential risk",
    annotation_position="top left",
)

lineChartHsigSites.update_traces(
    selector=dict(name="Hsig"),  
    name="Significant Wave Height (m)",  
)
lineChartHsigSites.update_traces(
    selector=dict(name="Tz"),  # Select Tz trace
    name="Zero Upcrossing Wave Period (s)",  # Custom legend label for Tz
)


for annotation in lineChartHsigSites.layout.annotations:
    annotation.text = annotation.text.replace("Site=", "")  # Clean up facet titles

# Show the plot
lineChartHsigSites.show()

Current Hsig and Tz value in different sites¶

The data set is updated frequently to provide near real-time information on wave conditions. In addition to analyzing the wave conditions over the past week, monitoring the latest wave data is also crucial. A scatter plot is used to display the latest wave conditions across the sites. In Figure 4, the most recent Hsig is plotted against Tz for each site. If both Hsig and Tz are high at a particular site, it indicates energetic waves and a risk of coastal damage.

Two reference lines are drawn to indicate the level of risk. If a point lies near both lines, there is a high risk of wave damage at that site.

As the data update frequently, the current timestamp is included in the title to avoid confusion.
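Taking the last row of each group relies on the rows already being sorted by time. Below is a sketch of an order-independent alternative that selects the row with the greatest timestamp per site via `idxmax` (made-up values; column names follow the notebook):

```python
import pandas as pd

# made-up readings, deliberately out of time order
toy = pd.DataFrame({
    "Site": ["A", "B", "A", "B"],
    "DateTime": pd.to_datetime([
        "2025-04-13 20:30", "2025-04-13 20:00",
        "2025-04-13 20:00", "2025-04-13 20:30",
    ]),
    "Hsig": [2.3, 1.7, 2.2, 1.9],
    "Tz": [5.8, 6.1, 5.9, 6.2],
})

# index label of the most recent reading for each site
latest_idx = toy.groupby("Site")["DateTime"].idxmax()
latest = toy.loc[latest_idx].reset_index(drop=True)
print(latest)
```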

In [14]:
latestWaveData = wave_SEQ_df.groupby("Site",sort=False)[["Hsig","Tz","DateTime"]].last().reset_index()


scatterFig=px.scatter(latestWaveData, 
    x="Tz", 
    y="Hsig", 
    color="Site", 
    title=f"Figure 4: Latest Significant Wave Height (Hsig) and Average Zero Upcrossing Wave Period (Tz) <br> of South East Queensland (updated at: {waveRecentAccess:%Y-%m-%d %H:%M})",
    labels={"Tz": "Average Zero Upcrossing Wave Period (s)", "Hsig": "Significant wave height (m)"}
)

scatterFig.update_traces(marker=dict(size=12))  
scatterFig.add_hline(
    y=6,
    line=dict(color="red", dash="dash"),
    annotation_text="6m potential damage to foreshore",
    annotation_position="top left",

 )
scatterFig.add_vline(
    x=12,
    line=dict(color="blue", dash="dash"),
    annotation_text="12s potential risk",
    annotation_position="bottom right",

 )

scatterFig.show()

[Q5] Extract Insights¶

  • Referring back to the question, what can we learn from this data (analysed and visualised above)?
  • Thinking about the kind of data, how might it be used strategically in a major weather event?
  • Who might benefit most from strategic use of this data.

What can we learn from the wave height data for South East Queensland?¶

  1. Similar trends in significant wave height and zero upcrossing wave period across South East Queensland
    Figures 1 and 2 show that the daily maximum Hsig and Tz follow similar trends across sites most of the time.

  2. Wave conditions across South East Queensland
    Figure 3 illustrates the wave height and wave period over the past 48 hours. It provides insights into wave conditions at various locations. Additionally, the sea state is classified into nine categories according to the World Meteorological Organization (WMO). By tracking the Hsig, we can assess how calm or rough the sea is at different locations over time.

    Wave Data
    (Australian Government Bureau of Meteorology)

  3. Current risk
    From Figure 4, the current risk at each site can be assessed from the position of its data point. If a point lies towards the top-right corner of the scatter plot, close to the 6 m wave height and 12 s wave period reference lines, it indicates high-energy waves. This condition may lead to foreshore damage, and immediate action should be taken to mitigate the impact of the waves.

How might this data be used strategically during a major weather event?¶

To assess risk during a major weather event, check whether the current significant wave height (Hsig) and zero upcrossing wave period (Tz) in Figure 4 are as high as the values recorded during past floods or major weather events.

  • Flood prediction – By comparing current and daily maximum wave data with data from past major floods, the authorities can assess the risk of flooding. If the current wave height and period are unusually high, the government can issue flood warnings to people in high-risk areas.

  • Evacuation planning – The data helps identify which areas are most at risk of storm surges and where evacuation might be needed. It helps authorities plan for safe zones, shelters and allocation of resources.

  • Beach safety alerts & surfer warnings – Authorities can use wave data to issue safety warnings to beachgoers. If waves are too strong or dangerous, they can close the beaches.

Who might benefit most from the data?¶

  • Disaster response agencies – The data helps predict severe weather events like cyclones and storm surges, enabling them to issue early warnings and take preventive actions.

  • Surfers – The data allows surfers to check wave conditions and plan their trips based on their skill level.

  • Beach authorities – The data helps authorities assess risks and deploy suitable safety measures.

  • Ship owners & marine operators – The data helps ship owners and marine operators choose docking locations and take measures to minimize losses.


Part B - creating a narrative to answer significant questions¶

SCENARIO: I am responsible for providing helpful analysis to a variety of government departments (including Police and Emergency Services). The responsible government ministers have asked my team to make data analysis plans to assist during a major weather emergency in South East Queensland. I have been tasked (within my team) to focus on waves and storm tide monitoring. My team has also suggested that I consider flood maps for relevant areas (See: Local government flood maps and data).

Question¶

This analysis focuses on a single site, Surfers Paradise on the Gold Coast. Flood management in Queensland is handled by local governments, and their data can vary in format and content. Concentrating on a single location keeps the data consistent. The same analysis plan can be adapted to other locations with minor adjustments to the data.

This analysis focuses on the following two questions.

  1. How likely is flooding to occur at Surfers Paradise in the coming hours?
  2. If flooding is likely, when is it expected to occur at Surfers Paradise?

When a major weather event occurs, it may cause flooding. This analysis plan aims to predict how likely flooding is at Surfers Paradise and when it may occur. The results help government departments plan services that reduce the impact on residents.

Data¶

In [15]:
# Import necessary libraries
import pandas as pd
import datetime
import plotly.express as px
import pytz

Read the data¶

The near real-time wave data were retrieved in Part A. In this part, the storm tide data are downloaded from the URL.

In [16]:
stormTideURL="https://www.data.qld.gov.au/datastore/dump/7afe7233-fae0-4024-bc98-3a72f05675bd?bom=True"

tide_df=pd.read_csv(stormTideURL,index_col="_id")

localTimezone = pytz.timezone('Australia/Brisbane')  
tideRecentAccess = datetime.datetime.now(localTimezone)

print(f"The current data was accessed on {tideRecentAccess:%Y-%m-%d %H:%M}")

tide_headings = list(tide_df.columns)
site_list=set(tide_df["Site"])

[noRow,noCol]=tide_df.shape 
print(f"There are {noRow} rows.")
print(f"There are {noCol} columns, which are:")
for col in tide_headings:
    print(">",col)

print(f"Data are from {len(site_list)} site(s).")
print(', '.join(site_list))
The current data was accessed on 2025-04-13 22:23
There are 65830 rows.
There are 8 columns, which are:
> Site
> Seconds
> DateTime
> Water Level
> Prediction
> Residual
> Latitude
> Longitude
Data are from 57 site(s).
mackaynew, burketown, dalbay, maroochydore, morningtonA, townsvillecard, birkdale, wavebreaknc, clumppoint, cairns, stpauls, lucinda, mourilyan, seaforth, capeferg, bowen, weipanx, husseycreek, gcseaway, cooktown, tweedsbj, rabybay, russellislande, whyteislandnx, mossman, palmcove, scarborough, weipahumbug, goldcoast, portalma, bundaberg, hallsbay, theskids, portdouglas, coombabahst, bananabank, cardwell, abellpoint, noosasandstg, boigu, iama, shorncliffe, townsville, thursdayisland, karumba, ugar, warraber, kubin, tangalooma, rosslyn, urangan, southtrees, wavebreakwc, burnett, mooloolaba, russellislandw, goldenbeach

Save the data¶

The tide data is stored locally.

In [17]:
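# the timestamp is formatted without spaces or colons so the file name is valid on all platforms
path="data/"
file_name_recent_tide=f'storm_tide_data({tideRecentAccess:%Y-%m-%d_%H%M}).csv'
tide_df.to_csv(f'{path}{file_name_recent_tide}')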

Read the data from a file¶

The data are loaded from the local copy.

In [18]:
tide_file_df = pd.read_csv(f"{path}{file_name_recent_tide}",index_col="_id")
tide_file_df
Out[18]:
Site Seconds DateTime Water Level Prediction Residual Latitude Longitude
_id
1 abellpoint 1743861600 2025-04-06T00:00 1.121 1.103 0.018 -20.2608 148.7103
2 abellpoint 1743862200 2025-04-06T00:10 1.128 1.119 0.009 -20.2608 148.7103
3 abellpoint 1743862800 2025-04-06T00:20 1.142 1.142 0.000 -20.2608 148.7103
4 abellpoint 1743863400 2025-04-06T00:30 1.157 1.172 -0.015 -20.2608 148.7103
5 abellpoint 1743864000 2025-04-06T00:40 1.175 1.207 -0.032 -20.2608 148.7103
... ... ... ... ... ... ... ... ...
65826 whyteislandnx 1744539600 2025-04-13T20:20 2.010 1.980 0.030 -27.4017 153.1574
65827 whyteislandnx 1744540200 2025-04-13T20:30 2.070 2.037 0.033 -27.4017 153.1574
65828 whyteislandnx 1744540800 2025-04-13T20:40 2.116 2.091 0.025 -27.4017 153.1574
65829 whyteislandnx 1744541400 2025-04-13T20:50 -99.000 2.141 -99.000 -27.4017 153.1574
65830 whyteislandnx 1744542000 2025-04-13T21:00 -99.000 2.186 -99.000 -27.4017 153.1574

65830 rows × 8 columns

Analysis¶

Convert the data type¶

The "DateTime" field is converted to the datetime object to enable date time manipulation.

In [19]:
tide_file_df['DateTime'] = pd.to_datetime(tide_file_df['DateTime'], errors='coerce')
print(tide_file_df.dtypes)
Site                   object
Seconds                 int64
DateTime       datetime64[ns]
Water Level           float64
Prediction            float64
Residual              float64
Latitude              float64
Longitude             float64
dtype: object

Rename the column¶

The columns are renamed to improve clarity.

  • Prediction: The predicted water level without the effect of storms or other weather conditions, i.e. the astronomical tide.
  • Residual: The difference between the actual and predicted water level. This difference is attributed to storm effects, i.e. the storm surge.
In [20]:
tide_file_df = tide_file_df.rename(columns={
    "Prediction": "Astronomical Tide",
    "Residual": "Storm Surge"
})
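The relationship between the three columns can be checked directly: the residual should equal the measured water level minus the prediction. Here is a quick check using the first three `abellpoint` rows shown in Out[18], rounded to the three decimals the data are published with:

```python
# (water level, prediction, residual) from the first rows of Out[18]
rows = [
    (1.121, 1.103, 0.018),
    (1.128, 1.119, 0.009),
    (1.142, 1.142, 0.000),
]

# storm surge recomputed as water level minus astronomical tide
surges = [round(water_level - prediction, 3) for water_level, prediction, _ in rows]

for (water_level, prediction, residual), surge in zip(rows, surges):
    print(f"{water_level:.3f} - {prediction:.3f} = {surge:.3f} (published residual {residual:.3f})")
```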

Reference framework: LAT¶

The tide data in the data set are recorded relative to the Lowest Astronomical Tide (LAT). Other reference frameworks exist, such as the Australian Height Datum (AHD). To keep the data comparable, a single reference framework must be used throughout the analysis. Since this analysis focuses on a single site rather than comparing multiple locations, it is not necessary to convert the data from LAT to AHD. LAT is used as the reference framework throughout this analysis unless otherwise stated.
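If the levels ever need to be compared with data referenced to AHD, the conversion is a constant shift per site: subtract the height of the AHD zero above LAT. The sketch below shows the arithmetic only. `AHD_ABOVE_LAT_M` is a placeholder, not the real offset for the Gold Coast gauge, which would have to be taken from official tidal datum information.

```python
# placeholder offset in metres: height of the AHD zero above LAT at the site (NOT a real value)
AHD_ABOVE_LAT_M = 1.0

def lat_to_ahd(level_lat_m: float, ahd_above_lat_m: float = AHD_ABOVE_LAT_M) -> float:
    """Convert a water level referenced to LAT into one referenced to AHD."""
    return level_lat_m - ahd_above_lat_m

# an example water level of 1.166 m above LAT
print(f"{lat_to_ahd(1.166):.3f} m AHD (with the placeholder offset)")
```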

Review the data¶

Before working on the data, a brief overview of the dataframe is noted. By examining summary statistics, abnormal or invalid values are identified.

In [21]:
print(tide_file_df.describe())
            Seconds                       DateTime   Water Level  \
count  6.583000e+04                          65830  65830.000000   
mean   1.744202e+09  2025-04-09 22:30:00.000000256    -14.387510   
min    1.743862e+09            2025-04-06 00:00:00    -99.000000   
25%    1.744031e+09            2025-04-07 23:10:00      0.737000   
50%    1.744202e+09            2025-04-09 22:30:00      1.542000   
75%    1.744372e+09            2025-04-11 21:50:00      2.439000   
max    1.744542e+09            2025-04-13 21:00:00      5.994000   
std    1.965892e+05                            NaN     37.178146   

       Astronomical Tide   Storm Surge      Latitude     Longitude  
count       65830.000000  65830.000000  65830.000000  65830.000000  
mean           -0.027155    -15.910470    -20.949007    146.209679  
min           -99.000000    -99.000000    -28.172100      0.000000  
25%             0.946000     -0.022000    -27.178000    145.403200  
50%             1.539000      0.096000    -21.176600    149.266050  
75%             2.321000      0.173000    -16.927700    153.119600  
max             6.029000      2.158000      0.000000    153.557700  
std            13.145607     36.497613      6.607690     19.898097  

Clean the invalid data¶

From the above summary, some data entries are not reasonable. These data need to be cleaned to ensure the quality of the analysis.

For example, the value -99 in the water level, astronomical tide and storm surge fields is invalid, as a water level of -99 m (i.e. 99 m below LAT) is impossible. Similarly, a longitude of 0 is invalid because it falls outside the geographic boundaries of Queensland. These data points are removed from the dataset.

In [22]:
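# drop rows where any measurement column holds the -99 placeholder, then rows with longitude 0
value_cols = ["Water Level", "Astronomical Tide", "Storm Surge"]
tide_file_df=tide_file_df[(tide_file_df[value_cols]!=-99).all(axis=1)]
tide_file_df=tide_file_df[tide_file_df["Longitude"]!=0]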
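One pitfall when filtering on several columns: in Python, `"Water Level" or "Astronomical Tide"` evaluates to the first non-empty string, so indexing a dataframe with such an expression selects only one column. Below is a sketch of checking all measurement columns at once, on made-up rows:

```python
import pandas as pd

SENTINEL = -99  # placeholder the dataset uses for missing readings

# made-up rows: the second and third each contain one sentinel value
toy = pd.DataFrame({
    "Water Level": [1.2, -99.0, 1.4],
    "Astronomical Tide": [1.1, 1.3, -99.0],
    "Storm Surge": [0.1, 0.2, 0.3],
})

# `or` between non-empty strings returns the first operand
first_operand = "Water Level" or "Astronomical Tide"
print(first_operand)

value_cols = ["Water Level", "Astronomical Tide", "Storm Surge"]
# keep a row only if none of its measurement columns holds the sentinel
clean = toy[(toy[value_cols] != SENTINEL).all(axis=1)]
print(clean)
```

An alternative design is to replace the sentinel with `NaN` and keep the rows, which preserves valid values in the other columns. Dropping whole rows is the simpler choice when every column is needed together.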

Filter the data to include only site in Surfers Paradise¶

Since this study focuses on Surfers Paradise, only data from the monitoring site located in Surfers Paradise should be used. As shown in the map below, the monitoring site labeled "goldcoast" is the only site located in Surfers Paradise. Therefore, only data from the "goldcoast" site are used for the study.

After filtering the data, the site name is corrected to "Gold Coast".

In [23]:
tide_lat_centre = (max(tide_file_df[tide_file_df["Site"]=="goldcoast"]['Latitude']) + min(tide_file_df[tide_file_df["Site"]=="goldcoast"]['Latitude']))/2
tide_lon_centre = (max(tide_file_df[tide_file_df["Site"]=="goldcoast"]['Longitude']) + min(tide_file_df[tide_file_df["Site"]=="goldcoast"]['Longitude']))/2

tideFig = px.scatter_map(tide_file_df, lat="Latitude", lon="Longitude",
                     size_max=60, zoom=9, 
                    center={'lat':tide_lat_centre, 'lon':tide_lon_centre},text="Site")
tideFig.show()
In [24]:
targetSite="goldcoast"
properSiteName="Gold Coast"
tide_TS_df=tide_file_df[tide_file_df["Site"]==targetSite].copy()  # copy so the rename does not act on a view
tide_TS_df["Site"] = properSiteName

tide_TS_df
Out[24]:
Site Seconds DateTime Water Level Astronomical Tide Storm Surge Latitude Longitude
_id
18161 Gold Coast 1743861600 2025-04-06 00:00:00 1.166 1.074 0.092 -27.9386 153.4326
18162 Gold Coast 1743862200 2025-04-06 00:10:00 1.219 1.109 0.110 -27.9386 153.4326
18163 Gold Coast 1743862800 2025-04-06 00:20:00 1.228 1.141 0.087 -27.9386 153.4326
18164 Gold Coast 1743863400 2025-04-06 00:30:00 1.268 1.173 0.095 -27.9386 153.4326
18165 Gold Coast 1743864000 2025-04-06 00:40:00 1.298 1.204 0.094 -27.9386 153.4326
... ... ... ... ... ... ... ... ...
19289 Gold Coast 1744538400 2025-04-13 20:00:00 1.531 1.551 -0.020 -27.9386 153.4326
19290 Gold Coast 1744539000 2025-04-13 20:10:00 1.691 1.563 0.128 -27.9386 153.4326
19291 Gold Coast 1744539600 2025-04-13 20:20:00 1.565 1.569 -0.004 -27.9386 153.4326
19292 Gold Coast 1744540200 2025-04-13 20:30:00 1.580 1.570 0.010 -27.9386 153.4326
19293 Gold Coast 1744540800 2025-04-13 20:40:00 1.490 1.566 -0.076 -27.9386 153.4326

1133 rows × 8 columns

Apart from the tide data, the wave data are also useful in this study. The wave data from Part A are used.

Similarly, only the wave monitoring site named "Gold Coast" is located in Surfers Paradise, so the wave data are filtered to include only that site.

In [25]:
wave_lat_centre = (max(wave_file_df[wave_file_df["Site"]=="Gold Coast"]['Latitude']) + min(wave_file_df[wave_file_df["Site"]=="Gold Coast"]['Latitude']))/2
wave_lon_centre = (max(wave_file_df[wave_file_df["Site"]=="Gold Coast"]['Longitude']) + min(wave_file_df[wave_file_df["Site"]=="Gold Coast"]['Longitude']))/2

waveFig = px.scatter_map(wave_file_df, lat="Latitude", lon="Longitude",
                     size_max=60, zoom=9, 
                    center={'lat':wave_lat_centre, 'lon':wave_lon_centre},text="Site",color_discrete_sequence=["red"])
waveFig.show()
In [26]:
wave_TS_df=wave_SEQ_df[wave_SEQ_df["Site"]==properSiteName]

wave_TS_df
Out[26]:
Site DateTime Hsig Tz Longitude Latitude
_id
5194 Gold Coast 2025-04-06 00:00:00 0.98 6.09 153.43906 -27.96435
5195 Gold Coast 2025-04-06 00:30:00 0.94 5.95 153.43906 -27.96441
5196 Gold Coast 2025-04-06 01:00:00 0.95 6.06 153.43897 -27.96456
5197 Gold Coast 2025-04-06 01:30:00 0.93 6.11 153.43897 -27.96482
5198 Gold Coast 2025-04-06 02:00:00 0.93 5.75 153.43910 -27.96494
... ... ... ... ... ... ...
5567 Gold Coast 2025-04-13 18:30:00 2.24 5.73 153.43912 -27.96419
5568 Gold Coast 2025-04-13 19:00:00 2.31 5.66 153.43910 -27.96419
5569 Gold Coast 2025-04-13 19:30:00 2.28 5.91 153.43912 -27.96420
5570 Gold Coast 2025-04-13 20:00:00 2.27 5.96 153.43921 -27.96417
5571 Gold Coast 2025-04-13 20:30:00 2.33 5.84 153.43921 -27.96417

378 rows × 6 columns

Get the recent 48 hours data¶

Storm and weather conditions change rapidly. Data from 7 days ago are of little use for predicting future conditions, because the storm and tide conditions of a week earlier may offer no insight into what comes next. For example, Ex-Cyclone Alfred affected South East Queensland for only 4 days.

To make the analysis more useful, only the most recent 48 hours of data are used. The astronomical tide pattern repeats roughly every 24 hours (one tidal day is about 24 hours 50 minutes), so 48 hours covers approximately two tidal cycles.

In [27]:
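# Kernel clock time; later cells reuse this value
currentTime= datetime.datetime.now()

# "DateTime" holds naive Brisbane local times, so build the 48-hour cutoff on the same clock
brisbaneNow = datetime.datetime.now(pytz.timezone("Australia/Brisbane")).replace(tzinfo=None)
tide_TS_recent_df=tide_TS_df[tide_TS_df["DateTime"] >= brisbaneNow - pd.Timedelta(hours=48)] 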

tide_TS_recent_df
Out[27]:
Site Seconds DateTime Water Level Astronomical Tide Storm Surge Latitude Longitude
_id
19016 Gold Coast 1744374600 2025-04-11 22:30:00 0.969 0.831 0.138 -27.9386 153.4326
19017 Gold Coast 1744375200 2025-04-11 22:40:00 0.927 0.779 0.148 -27.9386 153.4326
19018 Gold Coast 1744375800 2025-04-11 22:50:00 0.930 0.726 0.204 -27.9386 153.4326
19019 Gold Coast 1744376400 2025-04-11 23:00:00 0.837 0.676 0.161 -27.9386 153.4326
19020 Gold Coast 1744377000 2025-04-11 23:10:00 0.837 0.627 0.210 -27.9386 153.4326
... ... ... ... ... ... ... ... ...
19289 Gold Coast 1744538400 2025-04-13 20:00:00 1.531 1.551 -0.020 -27.9386 153.4326
19290 Gold Coast 1744539000 2025-04-13 20:10:00 1.691 1.563 0.128 -27.9386 153.4326
19291 Gold Coast 1744539600 2025-04-13 20:20:00 1.565 1.569 -0.004 -27.9386 153.4326
19292 Gold Coast 1744540200 2025-04-13 20:30:00 1.580 1.570 0.010 -27.9386 153.4326
19293 Gold Coast 1744540800 2025-04-13 20:40:00 1.490 1.566 -0.076 -27.9386 153.4326

278 rows × 8 columns
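
The "DateTime" column appears to hold Brisbane local times without timezone information, so the start of the 48-hour window has to be computed on the same clock. The following is a minimal sketch of that idea, not the notebook's own code: the helper `recent_rows`, the toy frame and the fixed "now" are hypothetical and are used only to make the result reproducible.

```python
import datetime
from zoneinfo import ZoneInfo

import pandas as pd


def recent_rows(df, now_utc, hours, tz="Australia/Brisbane"):
    """Keep rows whose naive local "DateTime" falls within the last `hours` hours."""
    # Convert the aware UTC time to naive local time so it is comparable with the column
    local_now = now_utc.astimezone(ZoneInfo(tz)).replace(tzinfo=None)
    cutoff = local_now - datetime.timedelta(hours=hours)
    return df[df["DateTime"] >= cutoff]


# Toy frame with hypothetical 12-hourly records in Brisbane local time
toy_df = pd.DataFrame({
    "DateTime": pd.to_datetime([
        "2025-04-11 10:00", "2025-04-11 22:00", "2025-04-12 10:00",
        "2025-04-12 22:00", "2025-04-13 10:00", "2025-04-13 22:00",
    ])
})

# 12:23 UTC on 13 April is 22:23 in Brisbane (UTC+10),
# so the 48-hour cutoff is 22:23 on 11 April local time
now_utc = datetime.datetime(2025, 4, 13, 12, 23, tzinfo=datetime.timezone.utc)
recent_df = recent_rows(toy_df, now_utc, hours=48)
print(recent_df)
```

Anchoring the window on an explicit timezone keeps the result the same whether the notebook runs on a laptop set to Brisbane time or on a server set to UTC.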

Integrate the storm tide components¶

Storm tide is the result of three factors: astronomical tide, storm surge, and wave setup. The sum of these components is the total storm tide height. When this total exceeds a threshold, flooding may occur.

Therefore, to assess the likelihood of flooding at Surfers Paradise, we need data for all three components.

  • Astronomical tide: Provided by the "Prediction" field of the tide dataset (shown as the "Astronomical Tide" column in the dataframes above).
  • Storm surge: Provided by the "Residual" field of the tide dataset (shown as the "Storm Surge" column).
  • Wave setup: Not directly provided by the dataset. It can be estimated from the significant wave height (Hsig) in the wave data from Part A. Wave setup is approximately 20% of the significant wave height (Hughes, 2016).

To calculate the total storm tide height, the dataframes from Part A (wave data) and Part B (tide data) are combined. However, the timestamps in these two datasets are not exactly aligned. Therefore, the dataframes are matched using the nearest available datetime.

Storm Tide Definition

(Australian Bureau of Meteorology & Queensland Fire and Emergency Services, 2015)

In [28]:
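# Keep only the wave columns needed for the merge
wave_subset = wave_TS_df[['DateTime', 'Hsig', 'Tz']]
# Attach to each tide record the wave record with the nearest timestamp
# (merge_asof requires both frames to be sorted by 'DateTime')
wave_tide_merged_df = pd.merge_asof(tide_TS_recent_df, wave_subset, on='DateTime', direction='nearest')

# Estimate wave setup as 20% of the significant wave height (Hughes, 2016)
wave_tide_merged_df['Wave Setup'] = wave_tide_merged_df['Hsig'] * 0.2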

wave_tide_merged_df
Out[28]:
Site Seconds DateTime Water Level Astronomical Tide Storm Surge Latitude Longitude Hsig Tz Wave Setup
0 Gold Coast 1744374600 2025-04-11 22:30:00 0.969 0.831 0.138 -27.9386 153.4326 1.36 4.85 0.272
1 Gold Coast 1744375200 2025-04-11 22:40:00 0.927 0.779 0.148 -27.9386 153.4326 1.36 4.85 0.272
2 Gold Coast 1744375800 2025-04-11 22:50:00 0.930 0.726 0.204 -27.9386 153.4326 1.41 4.95 0.282
3 Gold Coast 1744376400 2025-04-11 23:00:00 0.837 0.676 0.161 -27.9386 153.4326 1.41 4.95 0.282
4 Gold Coast 1744377000 2025-04-11 23:10:00 0.837 0.627 0.210 -27.9386 153.4326 1.41 4.95 0.282
... ... ... ... ... ... ... ... ... ... ... ...
273 Gold Coast 1744538400 2025-04-13 20:00:00 1.531 1.551 -0.020 -27.9386 153.4326 2.27 5.96 0.454
274 Gold Coast 1744539000 2025-04-13 20:10:00 1.691 1.563 0.128 -27.9386 153.4326 2.27 5.96 0.454
275 Gold Coast 1744539600 2025-04-13 20:20:00 1.565 1.569 -0.004 -27.9386 153.4326 2.33 5.84 0.466
276 Gold Coast 1744540200 2025-04-13 20:30:00 1.580 1.570 0.010 -27.9386 153.4326 2.33 5.84 0.466
277 Gold Coast 1744540800 2025-04-13 20:40:00 1.490 1.566 -0.076 -27.9386 153.4326 2.33 5.84 0.466

278 rows × 11 columns
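
To see how the nearest-timestamp match behaves, the sketch below repeats the merge on three tide records whose values are copied from the first rows of Out[28]. The two wave timestamps (22:30 and 23:00) are an assumption inferred from the 30-minute sampling visible in Out[26], and the frame names `tide` and `wave` are illustrative only.

```python
import pandas as pd

# Tide records every 10 minutes (values copied from the first rows of Out[28])
tide = pd.DataFrame({
    "DateTime": pd.to_datetime(["2025-04-11 22:30", "2025-04-11 22:40", "2025-04-11 22:50"]),
    "Astronomical Tide": [0.831, 0.779, 0.726],
    "Storm Surge": [0.138, 0.148, 0.204],
})

# Wave records every 30 minutes (timestamps are an assumption inferred from Out[26])
wave = pd.DataFrame({
    "DateTime": pd.to_datetime(["2025-04-11 22:30", "2025-04-11 23:00"]),
    "Hsig": [1.36, 1.41],
})

# Both frames are already sorted by DateTime, as merge_asof requires
merged = pd.merge_asof(tide, wave, on="DateTime", direction="nearest")

# Wave setup is estimated as 20% of Hsig; storm tide is the sum of the three components
merged["Wave Setup"] = 0.2 * merged["Hsig"]
merged["Storm tide"] = merged["Astronomical Tide"] + merged["Storm Surge"] + merged["Wave Setup"]
print(merged.round(3))
```

The 22:40 tide record is matched to the 22:30 wave record (10 minutes away) and the 22:50 record to the 23:00 wave record, which agrees with the Hsig values in Out[28]. `pd.merge_asof` also accepts a `tolerance` argument, which could be used to leave Hsig missing when no wave record is close enough.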

Obtain an appropriate aggregate¶

The maximum values of the astronomical tide, storm surge and wave setup are calculated. Maximum values are used instead of averages as a precautionary approach: overestimating wave energy and storm surge is safer than underestimating them when evaluating flood risk. Because the three maxima need not occur at the same time, their sum is an upper bound rather than an observed water level.

In [29]:
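# Maximum of each storm tide component at the site over the past 48 hours
wave_tide_TS_recent_max_df=wave_tide_merged_df.groupby("Site")[['Astronomical Tide','Storm Surge','Wave Setup']].max()

# Turn the "Site" index back into a regular column
wave_tide_TS_recent_max_df = wave_tide_TS_recent_max_df.reset_index()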

wave_tide_TS_recent_max_df
Out[29]:
Site Astronomical Tide Storm Surge Wave Setup
0 Gold Coast 1.57 0.755 0.51

Visualisation¶

Visualize past storm tide trends¶

To illustrate the trend of storm tide levels over the past 48 hours, a stacked area chart is created. This visualization combines astronomical tide, storm surge and wave setup. The components are stacked to show how they collectively form the total storm tide.

Additionally, a horizontal reference line representing the Highest Astronomical Tide (HAT) for Surfers Paradise is drawn on the chart. If the total storm tide curve rises above this reference line, the water level exceeds the highest tide expected under non-storm conditions, suggesting that a flood event may have occurred.

In [30]:
HAT=2.16  # Highest Astronomical Tide for Gold Coast, in m LAT

fig1 = px.area(
    wave_tide_merged_df,
    x="DateTime",
    y=["Astronomical Tide", "Storm Surge", "Wave Setup"],
    title=f"Figure 5: Water level of the past 48 hours in Surfers Paradise <br> (Updated on {tideRecentAccess:%Y-%m-%d %H:%M})",
    labels={
        "DateTime": "Date Time",
        "value": "Water Level (m LAT)",
        "variable": "Water Level Component"
    },
)

fig1.add_hline(
    y=HAT,
    line=dict(color="red", dash="dash", width=2),
    annotation_text="HAT",
    annotation_position="top left" 
)


fig1.show()

Visualize the maximum storm tide¶

The maximum values of astronomical tide, storm surge and wave setup in the past 48 hours are visualized in a stacked bar chart. This visualization highlights the contribution of each component to the overall storm tide. It allows a clear comparison of the relative magnitudes and relationships.

In [31]:
df_long = wave_tide_TS_recent_max_df.melt(id_vars='Site', var_name='Quantity', value_name='Height')


fig2 = px.bar(
    df_long,
    width=500,
    x='Site',
    y='Height',
    color='Quantity',  
    barmode='stack',
    title=f'Figure 6: Predicted maximum water level <br> (Updated on {tideRecentAccess:%Y-%m-%d %H:%M})',
    labels={
        "Site": "",
        "Height": "Water Level (m LAT)",
        "Quantity": "Water Level Component"
    },
)

fig2.add_hline(
    y=HAT,
    line=dict(color="red", dash="dash", width=2),
    annotation_text="HAT",
    annotation_position="top left" 
)

fig2.show()

Insight¶

With the above data and figures, we can answer the questions:

  1. How likely is flooding at Surfers Paradise in the coming hours?
  2. If flooding is possible, when is it likely to occur at Surfers Paradise?

Assess flood risk¶

Flood risk is not determined solely by the height of the storm tide. The same storm tide height can have different consequences at different locations. The tolerance of infrastructure and buildings also plays a critical role.

According to the Queensland Development Code (Queensland Government, 2013), buildings must be constructed to withstand floods up to the Defined Flood Level (DFL). However, the DFL applies only to buildings and does not indicate the tolerance of beaches. Because Surfers Paradise is famous for its beaches, beach safety is also important.

To provide a general benchmark, the Highest Astronomical Tide (HAT) is used instead. HAT is the maximum tide height expected under non-storm conditions. Beaches are designed to withstand tides up to HAT, making it a practical threshold for assessing flood risk. The HAT of Gold Coast is 2.16 m LAT (Australian Bureau of Meteorology & Queensland Fire and Emergency Services, 2015). If the predicted storm tide exceeds HAT, the site is at higher risk of flooding or of danger on the beaches.

Even when the water level exceeds HAT, residents are not necessarily affected, because the DFL is set higher than HAT. Statutory DFLs also vary between building areas. Nevertheless, publishing the data and the predicted water levels remains valuable, because it allows residents to assess potential flood risks and take the necessary precautions.

The benchmark can differ between sites. To apply this analysis to another location, change the HAT value accordingly.

Show records exceeding HAT in the past 48 hours¶

Past records are useful for assessing the potential flood risk in the coming period. If any storm tide records in the past 48 hours exceed HAT, the location may already have been affected by a weather event. If that event continues, the water level is more likely to be higher than usual in the near future.

Figure 5 above shows the past 48 hours of data graphically, and this part provides more detail. The percentage of records in which the storm tide exceeds HAT is calculated and shown. It reflects the proportion of the past 48 hours during which flooding may have occurred.

If the water level exceeded HAT at any time in the past 48 hours, the percentage of such records is displayed in red, followed by the date and time of each record and the corresponding exceedance, as in the example below. Otherwise, a message is displayed in green.

Example output (the data is not accurate):

[Image: example output]

Additionally, these records can be used to test the credibility of the system. If reported flooding matches the dates and times suggested by these records, that would support HAT as a reliable reference point for predicting flood risk.

In [32]:
# Total storm tide = astronomical tide + storm surge + wave setup
wave_tide_merged_df["Storm tide"]=wave_tide_merged_df["Astronomical Tide"]+wave_tide_merged_df["Storm Surge"]+wave_tide_merged_df["Wave Setup"]

# Records whose storm tide exceeds HAT, and by how much
exceeds_HAT_df = wave_tide_merged_df[wave_tide_merged_df['Storm tide'] > HAT ].copy()
exceeds_HAT_df['exceedance'] = exceeds_HAT_df['Storm tide'] - HAT

# Share of records above HAT in the 48-hour window
noExceed=exceeds_HAT_df.shape[0]
totalRecord=wave_tide_merged_df.shape[0]
percentage=noExceed/totalRecord

hadFlooding=noExceed > 0

if hadFlooding :
    display(HTML(f'<h4 style="color:red; padding:50px">{percentage:.0%} of the records exceed HAT in the past 48 hours:</h4>'))
    print(exceeds_HAT_df[['DateTime', 'Storm tide', 'exceedance']])
else:
    display(HTML('<h4 style="color:green; padding:50px">No records exceeding HAT in the past 48 hours</h4>'))

2% of the records exceed HAT in the past 48 hours:

               DateTime  Storm tide  exceedance
127 2025-04-12 19:40:00       2.165       0.005
138 2025-04-12 21:30:00       2.185       0.025
198 2025-04-13 07:30:00       2.633       0.473
201 2025-04-13 08:00:00       2.248       0.088
272 2025-04-13 19:50:00       2.259       0.099
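
As a quick check of the percentage above, the arithmetic can be reproduced by hand. This sketch uses the counts visible in the outputs (5 exceeding records out of 278 merged records); the variable names are illustrative and are not part of the notebook.

```python
# Counts taken from the outputs above: 5 of the 278 merged records exceed HAT
no_exceed = 5
total_records = 278
share = no_exceed / total_records

# Zero decimals, as displayed in the notebook
shown = f"{share:.0%}"
# One decimal keeps a little more detail for small shares
detailed = f"{share:.1%}"

# Records are 10 minutes apart, so the exceedance covers roughly this many minutes in total
minutes_above = no_exceed * 10

print(shown, detailed, minutes_above)
```

With zero decimals the share is shown as 2%, while one decimal gives 1.8%. The five records are not consecutive, so the 50 minutes are spread over several separate high-water periods.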

Predict whether flooding will happen¶

The percentage of records exceeding HAT in the past 48 hours is a useful reference, but it may not present the whole picture of future flood risk.

Flooding is likely to occur when a high astronomical tide coincides with a strong storm surge. If a strong storm surge occurs during a low tide, it may not result in flooding. However, if the storm surge continues and aligns with a high tide, the risk of flooding increases.

To assess the likelihood of future flooding, we assume that the storm and wave conditions will continue into the near future. Specifically, the predicted storm surge and wave setup are assumed to equal their maximum values in the past 48 hours.

We add the predicted storm surge and predicted wave setup to the highest astronomical tide level to find the maximum possible water level in the future. If the total exceeds HAT, it indicates a high risk of flooding. If it remains below HAT, the likelihood of flooding is considered low.

Figure 6 shows this predicted highest water level. If the bar is higher than the red HAT reference line, there is a high possibility of flooding. The output below provides more detail about the prediction.

In [33]:
# Worst-case level: highest astronomical tide plus the maximum surge and wave setup
highestPossible=wave_tide_TS_recent_max_df["Astronomical Tide"]+wave_tide_TS_recent_max_df["Storm Surge"]+wave_tide_TS_recent_max_df["Wave Setup"]

# Round to 0.1 m so that negligible exceedances are not flagged
possibleHigherThanHAT=round(highestPossible.iloc[0] - HAT,1)>0

if (possibleHigherThanHAT):
    display(HTML('<h4 style="color:red; padding:50px">If the storm surge and wave conditions persist, flooding is possible in Surfers Paradise.</h4>'))
else:
    display(HTML('<h4 style="color:green; padding:50px">If the storm surge and wave conditions persist, flooding is unlikely in Surfers Paradise in the next 12 hours.</h4>'))

If the storm surge and wave conditions persist, flooding is possible in Surfers Paradise.
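
The worst-case sum described above can be checked by hand. The sketch below uses the three maxima shown in Out[29] and the HAT value of 2.16 m LAT; the variable names are illustrative only.

```python
# Maxima over the past 48 hours, copied from Out[29] (metres)
max_astronomical_tide = 1.57
max_storm_surge = 0.755
max_wave_setup = 0.51
HAT = 2.16  # Highest Astronomical Tide for Gold Coast (m LAT)

# Upper bound on the water level if all three maxima coincided
highest_possible = max_astronomical_tide + max_storm_surge + max_wave_setup
margin = highest_possible - HAT

print(f"Worst-case level: {highest_possible:.3f} m LAT")
print(f"Margin over HAT:  {margin:.3f} m")
```

The worst-case level is about 2.835 m LAT, roughly 0.7 m above HAT once rounded to one decimal place, so the check flags a possible flood. Storm surge and wave setup together contribute about 1.27 m, which is why the threshold is crossed even though the astronomical tide alone stays well below HAT.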

Predict when the flood will occur¶

If flooding is possible, the date and time at which it may occur are important for preparation.

Astronomical tide data from the past 24 hours are used to predict the tide level for the next 24 hours, as tidal patterns repeat approximately daily. The predicted storm surge and wave setup are added to these tide levels. The earliest time at which flooding becomes possible is reported, as in the example output below.

Example output (the data is not accurate):

[Image: example output]

In [34]:
date_range=''  # stays empty when no exceedance is predicted
if (possibleHigherThanHAT):
    # Assume the surge and wave setup stay at their maxima of the past 48 hours
    maxSurge = wave_tide_TS_recent_max_df["Storm Surge"].iloc[0]
    maxSetup = wave_tide_TS_recent_max_df["Wave Setup"].iloc[0]

    # Past 24 hours of records; "DateTime" is naive Brisbane local time, so use the same clock
    brisbaneNow = datetime.datetime.now(pytz.timezone("Australia/Brisbane")).replace(tzinfo=None)
    past_day_df = wave_tide_merged_df[wave_tide_merged_df["DateTime"] >= brisbaneNow - pd.Timedelta(hours=24)]

    # Reuse each past tide level as the forecast for the same time tomorrow; keep levels above HAT
    predicted_level = past_day_df["Astronomical Tide"] + maxSurge + maxSetup
    exceedance_df = pd.DataFrame({
        "DateTime": past_day_df["DateTime"] + pd.Timedelta(hours=24),
        "Exceedance": (predicted_level - HAT).round(1),
    })[predicted_level > HAT]

    if not exceedance_df.empty:
        date_range = exceedance_df["DateTime"].min()
        maxExceedance = exceedance_df["Exceedance"].max()
        display(HTML(f'<h4 style="color:red; padding:50px">Flooding may happen from {date_range:%Y-%m-%d %H:%M}.</h4>'))
        display(HTML(f'<h4 style="color:red; padding:50px">The highest water level may reach {maxExceedance:0.1f} m above HAT.</h4>'))

Flooding may happen from 2025-04-13 22:30.

The highest water level may reach 0.7 m above HAT.

Predict with the trend of the storm surge¶

The above prediction assumes that the storm and wave conditions of the past 48 hours will persist. However, this assumption may be false: the storm surge may be building up or weakening.

To address this uncertainty, we assess the trend of the storm surge over the past 24 hours. A significant upward trend may indicate a growing risk.

The trend is evaluated by calculating the mean change in storm surge between consecutive records. To eliminate insignificant fluctuations, the mean change is rounded to two decimal places. A positive mean change may suggest an elevated risk of flooding in the near future.

In [35]:
# Storm surge over the past 24 hours; "DateTime" is naive Brisbane local time, so use the same clock
brisbaneNow = datetime.datetime.now(pytz.timezone("Australia/Brisbane")).replace(tzinfo=None)
trend_df = wave_tide_merged_df[wave_tide_merged_df["DateTime"] >= brisbaneNow - pd.Timedelta(hours=24)].copy()
# Change in storm surge between consecutive 10-minute records
trend_df['change'] = trend_df['Storm Surge'].diff()

# Round so that insignificant fluctuations count as no change
meanChange = round(trend_df['change'].mean(), 2)

surgeBuildUp=meanChange > 0

if surgeBuildUp:
    display(HTML('<h4 style="color:red; padding:50px"> The storm surge shows an upward trend in the past 24 hours, indicating that it may be building up. </h4>'))
    trend="building up"
elif meanChange < 0:
    display(HTML('<h4 style="color:green; padding:50px"> The storm surge shows a downward trend in the past 24 hours, indicating that it may be weakening. </h4>'))
    trend="weakening"
else:
    display(HTML('<h4 style="padding:50px"> There is no significant change in the storm surge over the past 24 hours. </h4>'))
    trend="no significant change"

There is no significant change in the storm surge over the past 24 hours.
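
One property of the mean change is worth noting: the mean of consecutive differences depends only on the first and last readings in the window. The sketch below illustrates this with hypothetical surge values and compares it with a least-squares slope, which is one possible alternative and not the method used in this notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical storm surge readings (m) at 10-minute intervals
surge = pd.Series([0.10, 0.14, 0.09, 0.16, 0.12, 0.18, 0.15])

# The mean of consecutive differences telescopes to (last - first) / (n - 1),
# so it depends only on the two end points of the window
mean_change = surge.diff().mean()
end_point_change = (surge.iloc[-1] - surge.iloc[0]) / (len(surge) - 1)

# Least-squares slope per record, which uses every reading in the window
slope = np.polyfit(np.arange(len(surge)), surge.to_numpy(), deg=1)[0]

print(f"mean change:      {mean_change:.4f} m per record")
print(f"end-point change: {end_point_change:.4f} m per record")
print(f"fitted slope:     {slope:.4f} m per record")
```

Because the mean change ignores everything between the two end points, a single unusual first or last reading can hide or exaggerate a trend. A fitted slope is less sensitive to this, at the cost of a slightly more involved calculation.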

Summary¶

This part provides a summary of the above insights.

In [36]:
flood_report = f"""
<div style="">
    <h2>Gold Coast Surfers Paradise Flood Report and Forecast</h2>
    <p><strong>In the past 48 hours:</strong> 
        <span style="color: {'red' if hadFlooding else 'green'};">{percentage:.0%}</span> 
        of storm tide records exceeded the Highest Astronomical Tide.
    </p>
    <p><strong>If the storm surge continues:</strong> There 
        <span style="color: {'red' if possibleHigherThanHAT else 'green'};">
            {'will' if possibleHigherThanHAT else 'will not'}
        </span> 
        be potential flooding 
        <strong>{'from '+date_range.strftime('%Y-%m-%d %H:%M') if date_range else ''}</strong>.
    </p>
    <p><strong>The storm surge trend:</strong> 
        <span style="color: {'red' if surgeBuildUp else 'green'};">{trend.capitalize()}</span>.
    </p>
</div>
"""

display(HTML(flood_report))

Gold Coast Surfers Paradise Flood Report and Forecast

In the past 48 hours: 2% of storm tide records exceeded the Highest Astronomical Tide.

If the storm surge continues: There will be potential flooding from 2025-04-13 22:30.

The storm surge trend: No significant change.

Limitations¶

This analysis has several limitations.

  • This analysis uses the Highest Astronomical Tide (HAT) as the threshold for flooding. However, this may not be the most appropriate benchmark. A more reliable threshold could be established by analyzing historical flood data and comparing the water levels at the times flooding actually occurred.
  • Using the mean change in storm surge to determine its trend may be an overly simplistic approach. Other meteorological data could be used to predict the trend of the storm.
  • Relying on tide data from the past 24 hours to predict the next 24 hours may not be an effective forecasting method. Using official prediction data from the Bureau of Meteorology would provide a more accurate forecast.
  • Potential measurement errors from the monitoring site were not accounted for in this analysis.

References¶

Australian Bureau of Meteorology, & Queensland Fire and Emergency Services. (2015). Tropical Cyclone Storm Tide Warning Response System Handbook (12th ed.). https://www.disaster.qld.gov.au/__data/assets/pdf_file/0029/346358/Storm-Tide-Handbook.pdf

Bureau of Meteorology. (2015, December 17). Ruling the waves: How a simple wave height concept can help you judge the size of the sea. Social Media Blog - Bureau of Meteorology. https://media.bom.gov.au/social/blog/870/ruling-the-waves-how-a-simple-wave-height-concept-can-help-you-judge-the-size-of-the-sea/

Evans, J. (2007). Dangerous Waves – Long Period Swell. Coastal Conference; Bureau of Meteorology. https://www.coastalconference.com/2007/papers2007/Julie%20Evans.doc

Hughes, M. (2016). Coastal waves, water levels, beach dynamics and climate change Wave generation in the ocean. https://coastadapt.com.au/sites/default/files/factsheets/T3I4_Coastal_waves.pdf

Queensland Government. (2013). Queensland Development Code. Queensland Government. https://www.epw.qld.gov.au/__data/assets/pdf_file/0015/4263/mandatory3.5constructionofbuildingsinfloodhazardareas.pdf