Welcome to Coding for Conservation, a website focused on using coding, passive acoustic monitoring and the latest gadgets to study marine mammals in our oceans.




  • Deep learning in PAMGuard

    Southern right whale picture by Brian Skerry

    The continued increase in the storage capacity and battery life of acoustic recorders is allowing industry and academia to collect passive acoustic monitoring (PAM) data over ever larger spatial and temporal scales. With this glut of new data comes a problem – what do we do with all the hugely complex data collected by acoustic recorders? How do we accurately extract the highly variable vocalisations of our target species from dynamic and equally variable soundscapes? In the past, manual or semi-manual validation of data might have been an option (humans are still the best at pattern recognition) but with such large quantities of data this is no longer feasible. We need accurate automated algorithms, but over the past decades developing such algorithms – ones that are applicable to a wide range of environments, species and soundscape contexts – has proven extremely difficult… until recently.

    Supervised machine learning is when you feed an algorithm training data (i.e. data labelled with the correct answer to whatever problem we are trying to solve) and it automatically constructs an appropriate classifier model that can then be used to analyse new, unlabelled data. Deep learning is a subset of machine learning that uses artificial neural networks to train classifiers. Deep learning is a popular buzzword today, and it has real benefits for acoustics compared to previous machine learning methods:

    • Deep learning is scalable, i.e. its accuracy increases as you feed it more training data. You can also “top up” classifiers with new training data if required.
    • Another important aspect of deep learning algorithms is that they automatically extract features from data. A previous machine learning algorithm might have required a list of features, such as peak frequency, length, amplitude, etc., but a deep learning algorithm can ingest raw spectrogram images or even waveforms and work out the best features to extract for classification itself (in reality this is a little more complicated – see below).
    • The final, and perhaps most important, aspect is that deep learning methods are now used everywhere – in advertising, in your photo apps, in the military, by NASA, etc. – which means a massive ecosystem of code and services has been developed around deep learning technologies. So we have free access to state-of-the-art deep learning tech (from Microsoft/Google, etc.) which has probably had more R&D spending in the last few years than the sum total of all funding in bioacoustics research, ever.
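    To make the “raw input” point concrete, here is a minimal Python sketch – purely illustrative, not PAMGuard or JPAM code – of turning a waveform into the kind of plain spectrogram image a deep learning model can ingest directly, instead of hand-picked features:

```python
import numpy as np

def spectrogram(wave, fft_len=256, hop=128):
    """Very plain magnitude spectrogram: the kind of raw image a deep
    learning model can ingest instead of hand-picked features."""
    frames = [wave[i:i + fft_len] * np.hanning(fft_len)
              for i in range(0, len(wave) - fft_len + 1, hop)]
    return np.abs(np.fft.rfft(frames, axis=1)).T  # (freq bins, time frames)

sr = 4000                                  # assumed sample rate (Hz)
t = np.arange(sr) / sr
wave = np.sin(2 * np.pi * 440 * t)         # 1 s, 440 Hz test tone
spec = spectrogram(wave)
print(spec.shape)  # (129, 30)
```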

    In the context of practical application to PAM, deep learning allows us to train highly accurate classifiers that can cope with large variation in the temporal and spectral properties of signals within complex soundscapes. In fact, the approach is so good at acoustic analysis that it’s solving age-old problems that acoustic researchers have been tackling (with little progress) for decades; one example is automatically detecting right whale calls (Shiu et al. 2020).

    However, whilst deep learning has the potential to provide extremely powerful algorithms, there are caveats. Running deep learning classifiers can be very processor intensive compared to simpler detection and classification methods, they require large quantities of training data, and, like all automated algorithms, they can still be vulnerable to unexpected inconsistencies in data for which they have not been trained. In addition, despite the numerous research papers, huge code ecosystem, and hype around this undoubtedly effective approach to automated acoustic analysis, training and then running deep learning classifiers is still not straightforward or accessible, usually requiring coding in Python. These technical barriers restrict the uptake of deep learning to specialised research groups and thus its impact so far in marine acoustics has been limited.

    PAMGuard and Deep Learning

    PAMGuard (open-source software for passive acoustics) has always been about making the latest signal processing algorithms, for real time and post processing, available and accessible to researchers. The modular structure of PAMGuard means that any new module should be able to integrate with existing acoustic workflows, i.e. a new module capable of running deep learning models can take advantage of PAMGuard’s data management system, displays and real time functionality, and could provide a powerful and accessible tool for running deep learning models.

    A year ago, as part of a postdoc at Aarhus University, Denmark, I started looking into whether this was feasible and quickly decided it was not, mainly due to one massive hurdle: every deep learning model is different. Specifically, they are coded using different libraries, accept different types of input data and have different output formats. Creating the coding architecture around that was not something achievable by one (or even many) postdocs… that was, until Amazon stepped in.

    Whether or not you like giant, non-tax-paying, and often morally dubious tech giants, you have to hand it to them: they write some fantastic tools for development. One such tool is Amazon’s Deep Java Library – long story short, it allows any deep learning model to be loaded using just a few lines of Java code. PAMGuard is written in Java and so Amazon’s Deep Java Library was the perfect framework for creating a new deep learning module.

    So – the tools now existed to load and run any deep learning model easily, but there was still one issue – how do we figure out what the acoustic input data is? There’s perhaps a misconception that deep learning algorithms just work by accepting raw waveforms or spectrograms – in reality, however, the accuracy of deep learning models is greatly improved by applying a set of transforms to the raw data; this might be cropping a spectrogram, normalising, removing noise, etc. For PAMGuard’s deep learning module to work effectively, it had to replicate these steps in Java before passing data to a trained deep learning model. But every model is different (and uses different transforms), so how to deal with this without requiring hard coding in Java for every new model? Fortunately for me, Amazon wasn’t the only one who had come up with some useful deep learning tools. Two Python coding libraries (AnimalSpot and Ketos) had recently been released, each with a comprehensive framework to train deep learning models. Both libraries provided a relatively easy-to-use coding framework to allow researchers to clean up their spectrograms and then train deep learning models. Crucially though, for PAMGuard, the models produced by these frameworks were all in the same format and contained metadata describing the type of input required. This may all sound a bit technical, but the upshot was that a Java library could now be created which could guarantee compatibility with any model trained in AnimalSpot or Ketos without any additional coding required!

    An acoustic deep learning model usually requires a series of transforms to enhance an image (or waveform) before it is processed by the neural network. These transforms will have been used during training to increase the performance of the network and are often vital to ensure the deep learning model performs well. A key issue in developing the deep learning module for PAMGuard was figuring out how to replicate these transforms for a given model.
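    As a toy illustration (not the actual AnimalSpot/Ketos transforms), a transform chain can be thought of as a list of functions applied in order to the raw spectrogram before it reaches the network:

```python
import numpy as np

# Hypothetical transform chain mirroring what a trained model might expect.
# Each transform is a plain function; the chain is applied in order.
def normalise(spec):
    return (spec - spec.mean()) / (spec.std() + 1e-9)

def crop_freq(spec, lo, hi):
    return spec[lo:hi, :]  # keep only the frequency band of interest

def median_denoise(spec):
    # Crude noise-floor removal: subtract each row's median over time.
    return spec - np.median(spec, axis=1, keepdims=True)

transforms = [median_denoise, lambda s: crop_freq(s, 10, 60), normalise]

spec = np.random.default_rng(0).random((129, 40))  # stand-in spectrogram
for t in transforms:
    spec = t(spec)
print(spec.shape)  # (50, 40)
```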

    The Raw Deep Learning Module

    The first stage in developing the deep learning module was to create a library of spectrogram transforms that replicated those in both AnimalSpot and Ketos, plus some other commonly used transforms. I won’t go into detail, but the majority of the work is carried out by a new Java library, JPAM, created for the project but kept separate from PAMGuard so that it can easily be used elsewhere.

    Next was to define some limits on features. What will this module do and not do? Most acoustic deep learning approaches so far have involved segmenting acoustic data into discrete chunks, applying the relevant data transforms to each chunk and then passing the chunks to a deep learning model for prediction values (i.e. the probability that a chunk of sound contains a target vocalisation). There are other approaches, but it was decided that this classifier would only accept raw acoustic data and use the segmentation approach described above.

    A diagram of how the deep learning module works in PAMGuard. An input waveform is segmented into chunks. A series of transforms are applied to each chunk creating the input for the deep learning model. The transformed chunks are sent to the model. The results from the model are saved and can be viewed in real time (e.g. mitigation) or in post processing (e.g. data from SoundTraps).
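    The segment-and-predict loop in the diagram can be sketched in a few lines of Python (the “model” here is a stand-in function that just reacts to signal energy, not a real network):

```python
import numpy as np

def segment(wave, seg_len, hop):
    """Split a waveform into fixed-length, possibly overlapping chunks."""
    return [wave[i:i + seg_len]
            for i in range(0, len(wave) - seg_len + 1, hop)]

def fake_model(chunk):
    """Stand-in for a trained network: returns a 'prediction' in [0, 1].
    A real model would consume the transformed chunk instead."""
    return float(np.clip(np.abs(chunk).mean() * 5, 0, 1))

sr = 1000
wave = np.zeros(10 * sr)
wave[4 * sr:6 * sr] = np.sin(np.linspace(0, 2000 * np.pi, 2 * sr))  # a "call"
chunks = segment(wave, seg_len=2 * sr, hop=sr)                      # 2 s, 1 s hop
preds = [fake_model(c) for c in chunks]
print(len(chunks))  # 9
```

Chunks overlapping the “call” score highly; silent chunks score zero, and only above-threshold segments would be kept.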

    The next stage was designing the module. JPAM handled most of the deep learning and transforms heavy lifting, so creating the module was mainly a case of plumbing the right bits and pieces into PAMGuard and creating a UI. The main UI was built with JavaFX and is fairly straightforward – just a few controls to let users select the segment size and load a model. The models from AnimalSpot and Ketos both load their settings automatically, so there’s not much more for a user to do than select the framework they are using, browse to the deep learning model file, define a minimum prediction threshold and then run through their data in PAMGuard. When running on raw acoustic data, the module continually segments the data, and the raw waveforms from any segments that pass above threshold are saved to PAMGuard files. Users can then view and export the results in PAMGuard viewer mode.

    The UI for the deep learning model is very basic. Users select a model using the browse button and everything sets up automatically.

    After a lot of coding, questions on GitHub and, of course, coffee, a prototype module was created and seemed to be working pretty well. However, it soon became apparent that there were two glaring problems… it could only run AnimalSpot and Ketos models, and it was SLOW.

    A generic framework

    It was never going to be possible to allow users to run absolutely any model; however, many of the transforms applied to segments are fairly generic and common, and so a user should be able to replicate many of the transform sets required for a model using the existing transforms available. So, a third framework option was created – the so-called “Generic Model”. Using this, users can import any model and then manually define the input transforms using an additional UI. A preview of the transforms and their final input shape is available, potentially allowing an experienced user to import a non-AnimalSpot/Ketos model and get it to work in PAMGuard. To make life easier for anyone else using these models, the transforms can be saved to a file and then loaded up in another instance of PAMGuard. We tested this approach on a right whale classifier (Shiu et al. 2020) – also see the tutorial.
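    The idea of saving a transform set to a file so someone else can reload it might look something like this – a hypothetical JSON layout for illustration only; PAMGuard’s actual settings format differs:

```python
import json, os, tempfile

# Hypothetical description of a transform chain for a "Generic Model".
# Names and parameters here are illustrative, not PAMGuard's real schema.
transforms = [
    {"name": "decimate", "target_sr": 2000},
    {"name": "spectrogram", "fft_len": 256, "hop": 128},
    {"name": "crop_freq", "lo_hz": 50, "hi_hz": 500},
    {"name": "normalise"},
]

path = os.path.join(tempfile.mkdtemp(), "transforms.json")
with open(path, "w") as f:
    json.dump(transforms, f, indent=2)   # save alongside the model file

with open(path) as f:                    # ...later, in another instance
    loaded = json.load(f)
print(loaded[2]["hi_hz"])  # 500
```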

    The advanced settings pane allows users to create a series of transforms for a model. Note that this is only needed if the model is not a compatible PAMGuard format.

    Using “pre-detections” to increase processing speed

    Deep learning is useful for all species but it is slow! On a typical Intel consumer chip (without a graphics card), each prediction for a segment takes around 100 ms (around 5–10 ms if you have an NVIDIA graphics card or Apple M1 chip). As long as the segment hop is greater than 100 ms, the deep learning module will run in real time. That’s fine for a 2 second segment (e.g. right whales), but what about higher frequency species with short calls, like bats and toothed whales (both of which I study)? Many animal calls are shorter than 100 ms, and typically we want to run at 10× real-time speed for analysis of acoustic recordings and for stable real time operation. Thankfully the answer to this problem (without buying a lot of expensive hardware) was fairly straightforward – allow the data input into the module to come from “dumb” detectors as well as raw sound data. For example, the click detector in PAMGuard detects all transients in a defined filter band. Typically the detected transients will be less than 1% of the total raw sound data. If these are input into the deep learning module, they can be segmented and predictions applied in the same way as raw data. Running a “dumb” detector at a high false positive rate means most calls/clicks are detected but there is still a huge data reduction. You therefore get the advantage of more accurate deep learning classifiers without the large processing time overhead. An example of this is in the bat tutorial.
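    The back-of-envelope arithmetic behind the speed-up looks like this (numbers taken from the text above; a real-time factor above 1.0 means slower than real time):

```python
# Back-of-envelope speed gain from classifying only "dumb" detections.
pred_ms = 100        # ~time per deep learning prediction on a plain CPU
seg_hop_ms = 10      # hop needed for short bat/toothed whale calls

# Raw data: one 100 ms prediction for every 10 ms of audio,
# i.e. 10x SLOWER than real time.
raw_rtf = pred_ms / seg_hop_ms

# Pre-detections: transients are ~1% of the raw data, so only ~1% of
# segments ever reach the network.
detector_fraction = 0.01
assisted_rtf = raw_rtf * detector_fraction

print(raw_rtf, assisted_rtf)  # 10.0 0.1  (0.1 = 10x faster than real time)
```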

    Improvements to PAMGuard

    Of course, making a whole new module and making it work well opens a whole Pandora’s box of potential updates and improvements to PAMGuard. Here is a list of features introduced with the new deep learning module.

    > New display engine for spectrograms and waveforms in the Time Base Display

    We used bats as an example case to build the module. Often, bat researchers use a spectrogram and waveform to look at calls. The Time Base Display in PAMGuard could show spectrograms from raw data but not spectrograms or waveforms of detections. This seems easy to implement, but plotting the spectrograms and waveforms of potentially thousands of detections at different temporal scales is actually quite difficult. For example, imagine plotting a few thousand click detections, each with around 1000 samples… that’s a lot of points to plot, even for a powerful computer! Making the display transition seamlessly – i.e. showing waveforms and spectrograms both zoomed out and zoomed right in so you can see the individual samples – required a whole new plotting library.
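    One standard trick for this kind of plotting problem – a sketch of the general technique, not PAMGuard’s actual plotting code – is min-max decimation: reduce each waveform to a per-bin envelope before drawing, so a zoomed-out view needs only a few hundred points instead of millions:

```python
import numpy as np

def minmax_decimate(wave, n_bins):
    """Reduce a waveform to per-bin (min, max) pairs for plotting.
    Zoomed out, the envelope looks identical to the full waveform but
    needs only n_bins points instead of millions."""
    bins = np.array_split(wave, n_bins)
    return np.array([(b.min(), b.max()) for b in bins])

wave = np.sin(np.linspace(0, 200 * np.pi, 1_000_000))  # 1M-sample test signal
env = minmax_decimate(wave, 500)
print(env.shape)  # (500, 2)
```

Zoomed right in (a few thousand samples on screen), the display can switch back to drawing the raw samples directly.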

    The Time Base Display underwent some major changes to allow users to quickly scroll through waveforms and spectrograms of detected calls and deep learning detections. Users can now seamlessly scale from showing a few milliseconds to hours of data.

    > Symbol Manager

    PAMGuard can display data in a multitude of ways, for example a stem or scatter plot, spectrogram, waveforms, etc. Data are also plotted across different displays, such as the map display. There has been a unified symbol manager in PAMGuard for some years – this allows users, for example, to colour symbols in the same way across displays – so a detected dolphin whistle would be the same colour plotted on a spectrogram as it would be plotted on the map.

    Detections coloured by deep learning predictions. The click detector module in PAMGuard is first used to detect all transients in the acoustic data – some of these will be bats but a lot will be noise. All transient detections are passed to the deep learning module which tags each detection with a probability that it belongs to a certain species class, e.g. Daubetons, Mnaterri, Noctule or noise.

    The new symbol manager means that users can plot data based on a number of properties (e.g. peak frequency, event, classification…, etc.) – in this case the detected transients are plotted with an Amplitude (dB) axis as scatter points and colour coding by deep learning species probability.

    > Air Mode

    PAMGuard was designed for the marine environment but it can be just as useful in terrestrial studies. It’s super annoying when dBs are referenced to 1 µPa instead of 20 µPa, microphones are called hydrophones, depth is height, etc. “Air Mode” changes the UI to reflect terrestrial acoustics and fixes all these things (yes, I did go through all of the PAMGuard source code and change “hydrophone” to getRecieverName()).

    > MATLAB tools for extracting deep learning data

    The deep learning module creates its own detections, which are saved in PAMGuard binary files. The MATLAB library was updated so that these can be loaded – see the tutorials for some example code plotting right whale detections using the MATLAB library.

    Next Steps

    The next step is for other folks to use and test this module! There will inevitably be bugs, so if you find one please do send a bug report here.

    • This module gives folk the ability to run any deep learning model in PAMGuard – but it’s only really useful if there are actually models available for people to use. The tutorials demonstrate an example of a right whale and Danish bat species models, but Ketos and AnimalSpot can be used to train pretty much any species model. It would be great if folk made these available and open source so others can use them! A deep learning model for ADDs and military sonar might be fun and would have some pretty useful conservation applications… anyone?
    • More frameworks? – have a great open source framework for training acoustic models? Get in touch if you would like it added to PAMGuard.
    • Once this module has been tested – and if it proves useful – the next logical step to make deep learning models more accessible will be to allow users to train models within PAMGuard itself. That way users could, for example, mark out a number of detected clicks, categorise them to species and then train a deep learning model. That, however, is a big interdisciplinary job and would require some serious funding to get right. For now, AnimalSpot and Ketos are great and I encourage folk to check them out.

    Availability and tutorials

    The deep learning module will be available in the 2.01.06 release of PAMGuard. If you can’t wait until then, here is a link to a new installer and jar file. First use the installer to install PAMGuard and then copy the jar file into the PAMGuard programs folder (it should overwrite the current jar file). Make sure any previous versions of PAMGuard are uninstalled.

    There are two tutorials on bats and right whales. Check them out to get started. Please get in touch here or on GitHub if you have any questions.


    Deep learning is a super powerful tool and there are some great acoustic focused libraries to train acoustic models. PAMGuard now provides a module to run these models in real time or post processing, allowing anyone to deploy deep learning models on their acoustic data. So let’s get training and testing some more deep learning models (preferably using Ketos or AnimalSpot) and remember to make them open source and available so the whole bioacoustics and conservation community benefits!



    Thanks to the Ketos and AnimalSpot teams for their support

    Thanks to Marie Roche and her group for sharing their right whale model.

  • How to make an autonomous vertical array & time-sync SoundTraps

    Guest blog post by bioacoustics PhD student Chloe Malinka, @c_malinka

    We at Coding for Conservation would like to let you know about a recent publication, authored by researchers from the Marine Bioacoustics lab at Aarhus University, the Sea Mammal Research Unit, the Bahamas Marine Mammal Research Organisation, and Ocean Instruments.

    (A very rough first draft of this paper was originally posted here as a blog post in July 2018. Due to interest and accessibility, we decided to draft it as a manuscript for publication. A couple of field seasons later, here we are, ready to share our publication with you…)

    I recently had an opportunity to study the bioacoustics of a deep-diving toothed whale. I was interested in collecting passive acoustic recordings on an array containing multiple hydrophones. With this, I planned to detect echolocation clicks, classify them, and localise them.

    The array deployed next to diving pilot whales

    However, this presented me with a challenge: how do I deploy an array and collect recordings at several hundred meters of depth, where I anticipate my animal of interest to be? With traditional star and towed arrays, the cables all connect to recording gear on the boat, whereby all channels usually get recorded on the same soundcard. If I want to go deep, ~1000 m of cable is heavy and expensive, which means a big boat is needed, which is also expensive. …If only I had an autonomous array that I could set into the deep, without having to worry about its connection to a boat. Furthermore, I would need this array to be vertically oriented, and as straight as possible, to allow for minimal errors in acoustic localisations.

    I looked around the lab, and I came across a few (okay, 14) SoundTraps. These are autonomous hydrophones made by a company based out of New Zealand (Ocean Instruments). I’ve used these devices many times before and appreciated their user-friendliness, low noise floor, and large dynamic range.

    Peter & Pernille recovering the array

    I got in touch with their director, who had the foundations in place for a “Transmitter/Receiver” setup. This means that, so long as all the devices on an array are connected with a cable, the “Transmitter” can send out an electrical signal to all of the “Receivers” on the array. These pulses are sent out at a rate of 1 per second. The Transmitter and each Receiver record the sample number at which they sent or received each pulse. This information is stored and can be used to time-align the audio recordings of all devices on the array after data collection, to sample-level accuracy. In other words, we now have a way to treat these autonomous devices as if they were collecting audio data on the same soundcard.
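    A toy version of the pulse-based alignment looks like this (variable names and numbers are illustrative, not from the published analysis library; real recordings also need clock-drift handling across pulses):

```python
import numpy as np

# The Transmitter and each Receiver log the sample number at which every
# 1 Hz sync pulse was sent/received (96 kHz sampling assumed here).
tx_pulse_samples = np.array([96000, 192000, 288000])  # transmitter log
rx_pulse_samples = np.array([96117, 192118, 288116])  # one receiver's log

# The per-pulse offset tells us how far the receiver's sample clock is
# ahead of (or behind) the transmitter's.
offsets = rx_pulse_samples - tx_pulse_samples
offset = int(round(offsets.mean()))  # ~117 samples in this toy example

def to_tx_time(rx_sample):
    """Map a sample index in the receiver's recording onto the
    transmitter's timeline, giving sample-level alignment."""
    return rx_sample - offset

print(offset, to_tx_time(100117))  # 117 100000
```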

    How did I do this, and how can you do it, too? Check out our publication here:

    Malinka CE, Atkins J, Johnson M, Tønnesen P, Dunn C, Claridge D, Aguilar de Soto N, & PT Madsen (2020) “An autonomous hydrophone array to study the acoustic ecology of deep-water toothed whales.” Deep Sea Research I  


    – We developed an autonomous, deep-water, large-aperture vertical hydrophone array using off-the-shelf components to address the challenge of recording time-synchronised, high sample rate acoustic data at depth.

    – Array recordings can be used to quantify source parameters of toothed whale clicks.

    – We report on the design and performance of the portable and lightweight array.

    – Step-by-step directions on how to construct the array, as well as an analysis library for time synchronisation, are provided.

    Downloading data after a series of array deployments

    This publication also links to the time synchronisation library on GitHub, some research data on which this library can be trialled, and a step-by-step build-and-deploy guide in the Supplementary Materials.

    We genuinely hope that making these instructions, software, and analysis library open will make it easier for other researchers to employ this method.

    Questions, comments, access to publication? Get in touch.

    Guest blog post by Chloe Malinka @c_malinka

  • Finding Illegal Fish Bomb Blasts in Tanzania using Machine Learning and JavaFX


    Most research into wild marine mammals occurs in wealthy countries. Amazingly, in 2018, we still have very little idea what species are present, let alone their population size, health status, behaviour, etc., in many parts of the world. A solid first step to address this problem is to conduct a rapid assessment survey to determine which species of marine mammals are present in a given area. The idea of a rapid assessment survey is fairly straightforward: you take a boat out and survey the entire coastline of a country using visual observers to record the number and species of any whales and dolphins encountered. As well as being large, often charismatic animals that regularly appear at the surface – and so possible to detect visually at relatively long ranges – dolphins and whales are also highly vocal, using sound to communicate, and some species hunt and sense their surroundings with a sophisticated bio-sonar. So, for most marine mammal surveys, it also makes sense to acoustically monitor the area we are visually surveying. We do this by towing a hydrophone (underwater microphone) array behind the survey boat. That way, if the sea is rough and the animals are difficult to spot, you can still detect the tell-tale vocalisations of different species, and even localise their likely position.

    Back in 2015, I was part of a team led by Gill Braulik on a rapid assessment survey of marine mammals off the coast of Tanzania. We used a combined visual and acoustic survey method to determine the species of whales and dolphins present, and their spatial distributions along the coast of Tanzania. The survey was a success, and you can find our publication on this here (Braulik et al. 2018). However, during the analysis of the acoustic data it became apparent that there was a frequently detected loud “clapping” noise.

    After some investigation it became apparent that these were the long range acoustic signatures of illegal “blast fishing” – a fishing technique in which a bomb is thrown into the water to kill or stun fish, causing them to rise dead to the surface, allowing them to be quickly and easily scooped up by fishermen. The conservation implications of blast fishing include: indiscriminate killing of all species within the bomb’s range, damage to coral reefs, and significant noise pollution. We were looking for animals but discovered that our survey method also had the power to reveal how common illegal blast fishing was, and where it was happening. So we also produced another paper, this one focusing on the large number of bomb blasts that were detected. This got quite some traction in the press.
    Map of the 2015 acoustic survey (black lines) with detected bomb blasts marked as circles. (From Braulik et al. 2017).
    After the survey, other hydrophones detected the same thing: lots of bomb blasts. However, the acoustic detection of any bomb was still just opportunistic, recorded during other projects that were not exclusively focused on addressing the conservation concern of blast fishing. It became clear that what was needed was a long term acoustic study that could locate the likely position of each blast and quantify the full extent of the problem. And so, in 2018, Gill Braulik, Code4Africa and I teamed up to do exactly that. We deployed 4 state-of-the-art recording stations along the northern coast of Tanzania. The recording stations each have 3 synchronized hydrophones linked to a state-of-the-art recording unit (SoundTraps), allowing us to work out a bearing to a received bomb blast. If 2 or more of the 4 stations pick up a blast, a latitude/longitude location can be determined. The recording devices are based on ultra low power acoustic hardware and so can be deployed easily by a team of divers – something that’s really important when you don’t have access to specialized research vessels to deploy and recover gear. The project is ongoing, so they’re still out there recording bomb blasts and any/all other interesting sounds…
    Deploying an acoustic sensor on a Tanzanian reef. There are three hydrophones on the sensor. Photo by Johnny Miller.

    Acoustic Data Analysis

    Recently we recovered the first set of recordings: 50 days of acoustic data on a 30 minute on/off duty cycle × 3 recovered stations = 75 days of audio. How do we process that quantity of data on a very tight budget? Thankfully, there are some open-source projects out there to help us. The first stage was to process the acoustic data using PAMGuard. PAMGuard is great at quickly churning through large datasets and picking out interesting sections. The sound a bomb blast makes is low in frequency and lasts for a significant period of time (a second or so), and so we used the Ishmael detector and Clip Generator modules to save all high amplitude, long, low frequency sounds as 3 second ‘.wav’ clips. (Note the Ishmael detector is also available in the Ishmael software package.) This worked great, with thousands of short audio clips of potential bomb blasts generated. However, there are a bunch of low frequency sounds on reefs and many of them are not bombs. So the next stage was to work out which clips contained actual bomb blasts. With the recent advances in machine learning, it might initially seem sensible to train a neural net or other type of classifier to find bomb blasts (i.e. manually find some bombs for training data, train a classifier, and run it on the rest of the data). However, there are a few issues with this. A classifier is only as good as its training data, so that training data would have to be manually identified to begin with, which could be time consuming. In addition, this is very novel data. What if noise conditions change? What if there’s a species that starts vocalising during a different period of the year that confuses the classifier? To be diligent with the data analysis, even once the classifier has been trained, a manual analyst would have to check the classifier’s performance, at least for the first few years, by which time the project might be over.
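    For the curious, the data-volume arithmetic works out like this (the 30 minute on / 30 minute off duty cycle records for half of each deployment day):

```python
# Recovered data volume, step by step.
deployment_days = 50   # days each station was deployed
duty_fraction = 0.5    # 30 min on / 30 min off
stations = 3           # stations recovered so far

audio_days = deployment_days * duty_fraction * stations
print(audio_days)  # 75.0
```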
    Having a manual analyst listen to all the clips from PAMGuard is also not an option, as it is still far too time consuming on a tight budget. The solution is to take a machine-assisted approach. Rather than training a machine to make decisions, we created highly interactive tools, combined with machine learning, that allow a manual analyst to always have the final say. This cuts the time it takes to analyse large datasets by an order of magnitude (!) but maintains the valuable human oversight (we are, after all, still the best pattern recognition and decision making machines when it comes to bio-acoustics analysis!). Achieving this requires a first-run machine learning application to group clips together, followed by a highly interactive application that allows a manual analyst to inspect the groups and annotate the data.

    SoundSort. An app to quickly find bomb blasts

    The machine learning aspect is actually quite easy (thanks to cleverer folks who have figured this all out already): use t-SNE (t-distributed Stochastic Neighbor Embedding) to cluster spectrogram images. t-SNE can group similar spectrograms together. This has been done before with images, and even implemented for acoustics in one of Google’s AI experiments. Great! However, although the machine learning methods exist to group acoustic data, the example code to do so is in Python, which is not very accessible to many marine biology researchers, and nowhere near the interactive system envisaged. So what’s required is an application that presents the user with results in a similar way to Google’s AI experiments. JavaFX is a UI framework perfect for this task. It’s native, so it can handle the graphics intensive task of drawing thousands of clips, and it has a great set of third party libraries for additional user controls and styles. Plus it works on MacOS, Linux, PC, iOS and Android. This provided the perfect basis for building an application to perform t-SNE and allow a user to quickly and efficiently interact with the results. Before getting to the app we built, SoundSort, it should be noted that building a program like this without an entire research budget is only possible because of the efforts of the open source community. Even in this relatively simple application, there are multiple libraries used:
    • A fast and native Java implementation of the t-SNE algorithm.
    • The excellent controlsfx library for extra JavaFX bits and pieces.
    • JMetro for styling the app with fluent design theme.
    • FontawesomeFX for icons.
    • javafxsvg for reading svg files.
    • Apache Commons Math 3 for the fast Fourier transform and plenty of other useful functions.
    • iirj for filtering acoustic data before decimating.
    • algs4 for solving the assignment problem, i.e. taking clustered points from t-SNE and assigning them to a grid.
    • MatFileRW for writing and reading .mat files. This allows integration of the Java code with MATLAB/Octave.
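    To illustrate that last step: once t-SNE has placed clips in a 2-D plane, each clip needs its own grid cell for display. SoundSort uses algs4's optimal assignment solver; the sketch below is a much simpler greedy stand-in (with made-up points and grid size) that just shows the idea of snapping scattered points onto the nearest free cell.

```java
import java.util.Arrays;

// Greedy sketch: snap 2-D t-SNE points onto an n x n display grid.
// Not optimal like algs4's AssignmentProblem, but shows the idea.
public class GridAssign {

    // points: [k][2] t-SNE coordinates, already scaled to [0, n).
    // Returns the grid cell index (row * n + col) for each point.
    public static int[] assignToGrid(double[][] points, int n) {
        boolean[] taken = new boolean[n * n];
        int[] cell = new int[points.length];
        for (int p = 0; p < points.length; p++) {
            double best = Double.MAX_VALUE;
            int bestCell = -1;
            for (int r = 0; r < n; r++) {
                for (int c = 0; c < n; c++) {
                    int idx = r * n + c;
                    if (taken[idx]) continue;  // each clip gets a unique cell
                    double dy = points[p][0] - r, dx = points[p][1] - c;
                    double d = dx * dx + dy * dy;
                    if (d < best) { best = d; bestCell = idx; }
                }
            }
            taken[bestCell] = true;
            cell[p] = bestCell;
        }
        return cell;
    }

    public static void main(String[] args) {
        double[][] pts = { {0.1, 0.2}, {0.2, 0.1}, {1.8, 1.9} };
        // Two points compete for cell 0; the loser takes the next nearest.
        System.out.println(Arrays.toString(assignToGrid(pts, 2)));
    }
}
```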
    SoundSort is fairly simple to use. A user presses the browse button and imports a bunch of ‘.wav’ clips. They can decimate them to highlight the lower frequencies (if so desired) and choose which channel to use if the files are multi-channel.
    Clips can be imported and decimated.
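    As a rough sketch of what the decimation step does: low-pass filter, then keep every Mth sample. SoundSort uses a proper IIR low-pass from the iirj library before downsampling; the boxcar average below is only a crude stand-in to show the two-step idea.

```java
import java.util.Arrays;

// Decimation sketch: crude anti-alias filter, then downsample by 'factor'.
// A real implementation (as in SoundSort) uses a proper IIR low-pass.
public class Decimate {

    public static double[] decimate(double[] x, int factor) {
        double[] y = new double[x.length / factor];
        for (int i = 0; i < y.length; i++) {
            double sum = 0;
            for (int j = 0; j < factor; j++) sum += x[i * factor + j];
            // Averaging acts as a crude low-pass before keeping 1 of 'factor'.
            y[i] = sum / factor;
        }
        return y;
    }

    public static void main(String[] args) {
        double[] x = {1, 1, 3, 3, 5, 5, 7, 7};
        double[] y = decimate(x, 2);  // halves the sample rate
        System.out.println(Arrays.toString(y));  // [1.0, 3.0, 5.0, 7.0]
    }
}
```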
    The clips are presented on a grid. The user then clusters the spectrograms of all the clips using the t-SNE algorithm.
    Clips after import and before clustering.
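    For a sense of what t-SNE is actually fed, each clip's spectrogram can be flattened into one long feature vector. SoundSort computes its spectrograms with Apache Commons Math's FFT; the naive DFT below is a slow but dependency-free sketch of the same idea.

```java
// Sketch: turn a clip's waveform into a flattened spectrogram vector,
// the kind of input t-SNE clusters. A naive DFT is used here so the
// example is self-contained; a real app would use an FFT library.
public class SpectrogramVec {

    // Magnitude of the first n/2 DFT bins of one frame.
    static double[] dftMagnitude(double[] frame) {
        int n = frame.length;
        double[] mag = new double[n / 2];
        for (int k = 0; k < n / 2; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double ang = -2 * Math.PI * k * t / n;
                re += frame[t] * Math.cos(ang);
                im += frame[t] * Math.sin(ang);
            }
            mag[k] = Math.hypot(re, im);
        }
        return mag;
    }

    // Non-overlapping frames; per-frame magnitudes concatenated into one vector.
    public static double[] spectrogramVector(double[] wav, int frameLen) {
        int frames = wav.length / frameLen;
        double[] vec = new double[frames * frameLen / 2];
        for (int f = 0; f < frames; f++) {
            double[] frame = new double[frameLen];
            System.arraycopy(wav, f * frameLen, frame, 0, frameLen);
            double[] mag = dftMagnitude(frame);
            System.arraycopy(mag, 0, vec, f * frameLen / 2, mag.length);
        }
        return vec;
    }

    public static void main(String[] args) {
        // A constant (DC) signal puts all its energy in bin 0 of each frame.
        double[] v = spectrogramVector(new double[]{1, 1, 1, 1, 1, 1, 1, 1}, 4);
        System.out.println(java.util.Arrays.toString(v));
    }
}
```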
    Once clustered, SoundSort re-arranges the clips on the grid to correspond to the clusters. The user can also see the raw cluster data.
    The clustered clips represented as a grid. Also shows the program in “light mode”.
    Clips after they have been clustered in the cluster graph.
    Finally, the user can zoom in and out of the grid or graph and annotate the clips. The clustering means the user can quickly zoom into a section of interest and annotate the relevant clips. Clicking any clip plays it.
    Before or after clustering clips can be annotated.
    Once done with annotations, the user can export the annotated clips to folders named by the annotation group and/or export a ‘.mat’ file with program settings and annotations. We can then work out the bearings to bomb blasts and if two of our sensors pick up the same blast, cross the bearings to get a latitude/longitude location!
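    The bearing-crossing step is a small geometry problem: each sensor gives a line, and the blast sits where the lines intersect. This hypothetical sketch works on a local metre grid rather than real latitude/longitude (that conversion is left out), and the sensor positions and bearings are invented.

```java
public class CrossBearings {
    // Intersect two bearing lines. Positions are metres (east, north) on a
    // local grid; bearings are degrees clockwise from north. Returns the
    // (east, north) intersection, or null if the bearings are near-parallel.
    public static double[] intersect(double x1, double y1, double brg1,
                                     double x2, double y2, double brg2) {
        double a1 = Math.toRadians(brg1), a2 = Math.toRadians(brg2);
        // Direction vectors: east component = sin, north component = cos.
        double d1x = Math.sin(a1), d1y = Math.cos(a1);
        double d2x = Math.sin(a2), d2y = Math.cos(a2);
        double det = d2x * d1y - d1x * d2y;
        if (Math.abs(det) < 1e-12) return null;
        // Solve p1 + t1*d1 = p2 + t2*d2 for t1 (Cramer's rule).
        double bx = x2 - x1, by = y2 - y1;
        double t1 = (d2x * by - d2y * bx) / det;
        return new double[]{ x1 + t1 * d1x, y1 + t1 * d1y };
    }

    public static void main(String[] args) {
        // Sensor A at the origin hears the blast to the north-east (45 deg);
        // sensor B, 200 m to the east, hears it to the north-west (315 deg).
        double[] p = intersect(0, 0, 45, 200, 0, 315);
        System.out.printf("x=%.1f y=%.1f%n", p[0], p[1]);  // x=100.0 y=100.0
    }
}
```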


    Data analysis is ongoing, but SoundSort has already been used to find some blast fishing. Analysis for each listening station for each deployment should now take only a few hours at most, a far cry from the 75 days per station we started out with!
    What a fishing bomb blast sounds like.

    More stuff to do and JavaFX 11

    There’s plenty more work to do on SoundSort. One thing that would be great is to get the application working with JavaFX 11; it currently runs on JavaFX 8. JavaFX 11 is a separate library not included in the JRE and seems relatively easy to get going using Maven. However, there are several problems getting it working here, including that any library which has not been built with Java 11 and has an invalid auto-generated module name cannot be used (t-SNE in this case). Plus a bunch of the controls in ControlsFX, such as the range slider, do not work properly and there seems to be some weird version exception. It might be a case of waiting a while for things to catch up, but if anyone can get a Java(FX) 11 build working, give me a shout on Twitter!

    Technically, SoundSort should work with almost any type of sound, as long as it can be adequately represented on a spectrogram, e.g. tonal whale and dolphin sounds, fish sounds, boat engines etc. (though not echolocation clicks). There’s also no reason small modifications couldn’t allow short sound snippets such as echolocation clicks to be fed directly into the algorithm, or another type of transform (e.g. a Wigner plot) to be used to represent the data. So there’s plenty of scope for expanding features and using this program in other projects.


    By combining a highly interactive UI with human-assisted machine learning, we can process large acoustic datasets quickly and on a shoestring budget, ideal for applications in environmental conservation projects.
  • JavaFX is dead. Long live JavaFX? (from a science perspective)
    (Picture credit @UnequalScenes) I’m not a professional programmer; I’m a marine biologist/physicist. Working in industry and academia, I appreciate that for any program a good user interface (UI) is important, both in helping people learn how to use a piece of software and in allowing for efficient interaction with data. Despite the recent strides in machine learning, when it comes to bio-acoustics data analysis (and many more fields), humans still have an edge over machines; we’re still (usually) better at pattern recognition and can spot and deal with unexpected inconsistencies in datasets: we have initiative. It’s therefore important to create programs which allow humans to interact with and manipulate data at small and large temporal scales; this allows us to explore the datasets we’ve collected, to quickly pick up on ‘weird’ things and, if we’re processing using automated algorithms, to check they’re doing what we want them to do. Designing those types of programs is one of the parts of my job I really enjoy.

    In my first few years working in marine science I had learned Java and was helping out with a Java/Swing program to detect dolphin and whale vocalisations called PAMGuard (the subject of quite a few blog posts here). But I had been frustrated by how old Swing looked, the lack of modern controls and the fact that Java3D was ailing. Colleagues of mine had recommended HTML5 and other web-based UI technologies, but PAMGuard required complex 2D and 3D graphics, and having experienced HTML5 apps on phones compared to native apps, I was skeptical that these albeit popular technologies would be any better than Swing. Plus I didn’t want to rewrite the entire PAMGuard GUI in one go, and having everything bundled in a single jar file using a unified programming language was a lot easier. So inevitably, in 2015, after a few minutes on Google, I discovered JavaFX. It was getting a big upgrade in Java 8, it looked great and it could be styled with CSS.
There were modern controls and animations, it was easier to program than Swing and it had a fully featured 3D library. I started out programming a basic display in PAMGuard. That snowballed and eventually got funded by NOAA in the US to become a fully featured time-based data display. JavaFX was instrumental in making the display work. It wasn’t, however, without its problems. Embedding into Swing was clunky, and the lack of dialogs in early releases was a glaring omission, not to mention a few bugs here and there. But overall I was pleased. Over the next few years I moved to JavaFX as my main UI library and made a bunch of small applications, from simulation tools to control systems for sensors. You can see some of them below.
    A simple program to receive IMU sensor data from an instrument and show its rotation in real time. The device is actually part of a system to track dolphins around gill nets, but this is the PC interface to check everything is working. The JavaFX 3D library makes building an application like this straightforward. The 3D model moves in real time as the sensor moves, allowing you to rapidly check everything is OK. (Thanks to the JMetro library for the Metro theme used here.)
    A new time data display for PAMGuard. JavaFX and the excellent ControlsFX/FontAwesome libraries helped with transitions, animations, complex images and controls, making the whole thing look nicer and more user friendly.
    A new GUI for PAMGuard (proper post on this to come). One slow burn side project is building a completely new GUI for PAMGuard from the ground up using JavaFX. With JavaFX it’s much easier to make custom interactive displays. Here the data model in PAMGuard can be connected up by dragging plugs from different modules (colours are a little bit off in the gif).
    CetSim is a Java simulation program designed to work with MATLAB. Since MATLAB is essentially a pretty slow language, Monte Carlo type simulations can be programmed in Java, which does the hard work, reducing processing times by ~30x. I built a JavaFX GUI to check the simulation runs OK and to allow users to play around with settings before calling it from MATLAB. The whole thing only took a few days, because JavaFX makes it so easy to whip up nice GUIs quickly.
    The news that JavaFX was to be decoupled from Java was initially a slight shock. A lot of investment in time had gone into learning JavaFX, partly because I saw it as “future proofed” technology. I had heard rumours that uptake hadn’t been that good, but had not expected it to be removed from the main Java releases so quickly, especially as it had been slated as the successor to Swing, which is, incidentally, still supported. After the news, I read a few interesting posts by Jonathan Giles, Johan Vos and the folks at Gluon, and it seems like there’s going to be an effort to keep JavaFX alive. The consensus appears to be that having JavaFX fully in the open source community and progressing at its own pace is probably a good thing, but it’s all going to depend on a bunch of folks who are far better programmers than me putting their spare time and effort into a library which is free. I wonder about those complex maintenance tasks, like keeping up with displays and graphics drivers (JavaFX does not handle 4K video on Windows, for example). Hopefully Gluon will continue to be successful and help fund some of these types of jobs, keeping JavaFX up to date with the times. So, a slightly rambly post, but if there’s a message it’s this: JavaFX is a great library which makes creating complex and highly interactive GUIs a breeze. As a scientist I (and plenty of other people, apparently) use it all the time, so fingers crossed this decoupling works. And finally, thanks to all those developers who are trying to keep JavaFX alive and everyone who has contributed to all those JavaFX libraries out there; we scientists appreciate it.
  • Using PAMGuard to track the movements of marine life around a tidal turbine
    Guest blog post by bioacoustics PhD student Chloe Malinka (@c_malinka). We at Coding for Conservation would like to let you know about a recent publication, authored by researchers from the Sea Mammal Research Unit, SMRU Consulting, and a couple of PAMGuard programmers. Our paper presents findings from the first in-situ passive acoustic monitoring array for marine mammals at an operational tidal turbine. This post demonstrates some of the analysis techniques we used with the PAMGuard/MATLAB library, specifically providing some sample code for noise analysis. For a more detailed summary of the findings in this paper, check out the SMRU Consulting blog. We were interested in tracking the fine-scale movements of marine animals in the vicinity of an operational tidal turbine. The marine renewables industry shows great potential in terms of green energy contributions, but since it’s still in its early days, we needed to be able to check how marine life would behave around it. Would they avoid it? Could they detect it with enough time to get out of the way? Or would they collide with the blades? To answer these questions, we decided to record the sounds made by porpoises and dolphins on enough underwater microphones (hydrophones) that we could localise and then reconstruct the 3D tracks of these animals. We installed a passive acoustic monitoring (PAM) array, comprising 4 clusters of 3 hydrophones each, atop the base of the seabed-mounted turbine in Ramsey Sound, Wales. From one cluster, you could get a bearing to the animal, and when multiple clusters were ensonified, we could obtain multiple bearings; the location where these bearings from different clusters crossed would reveal the actual position of the animal. These 12 hydrophones were all connected to a data acquisition system installed inside the turbine, which in turn was connected to a PC back ashore via optical fibre.
The PC on shore ran the PAMGuard software to process the data in real time. Some of the hydrophones were damaged during installation, so we were collecting data from 7 hydrophones at a sample rate of 500 kHz, meaning we were collecting 3.5 million data points every second… for ~3 months. Had we saved all of the raw WAV files, the dataset would have come to ~55 TB (!). Instead, we used PAMGuard to detect potentially interesting sounds in real time and saved only snippets of the raw full-bandwidth data when something interesting seemed to be happening. The modules within PAMGuard that we used in our analyses included: 1) the click detector, 2) the whistle and moan detector, 3) the Large Aperture 3D Localiser module, to track the fine-scale movements of porpoises and dolphins on the array, and 4) the noise band analysis module. Here, we’ll focus on the noise band monitor.

    1) Click detector
    We configured PAMGuard to detect both porpoises and dolphins. Clicks from the 2 species were separated using the click detector classifier, which examines both the click waveform and the frequency spectrum of detected clicks. For an example of how to use PAMGuard’s MATLAB library to extract information from detected clicks, check out our previous blog post and tutorial here.

    2) Whistle & moan detector
    We also included a whistle detector to pick up any whistling dolphins. Again, for an example of how to use PAMGuard’s MATLAB library to extract information from whistles and moans, check out our previous blog post and tutorial here.

    3) Large Aperture 3D Localiser
    This module uses a time-delay based ‘Mimplex’ algorithm in combination with an MCMC simulation to match the same echolocation click on different hydrophones and select the most likely position the sound came from (described in detail in our previous publication).

    4) Noise Band Monitor
    Whenever we’ve got PAM kit in the water, it’s important to measure noise levels in the recording environment. This helps us determine the probability of detecting an animal. For example, if the sound of an animal is only slightly louder than the background noise, then it can only be detected at very close ranges and thus the probability of detection is very low, making the PAM system an inefficient choice for monitoring. If you can’t quantify how far out you can detect a sound of interest, then you don’t have a handle on how effective your monitoring system is.

    To set up the Noise Band Monitor: File > Add Module > Sound Processing > Noise Band Monitor. We made octave-band noise measurements in our analysis (you can modify this to 1/3-octave noise levels instead if you like). You can also select the number of decimators, which controls the lowest frequency. Then select the analysis window (“output interval”), over which you can record both the peak and mean noise levels. Values can be displayed as band energy (dB re 1 µPa) and/or spectrum level (dB re 1 µPa/√Hz). Select a filter for each band, and visualise your configuration in a Bode plot within the module (Figure 1).

    Figure 1. Bode plot showing noise band monitor settings.
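    For reference, the arithmetic behind these octave-band settings is simple. Assuming the usual definitions (band edges at fc/√2 and fc·√2 around a centre frequency fc, and band energy = spectrum level + 10·log10(bandwidth) for a spectrum that is flat across the band), a minimal sketch:

```java
// Sketch of the octave-band arithmetic behind noise band measurements.
// Assumes the standard definitions: edges at fc/sqrt(2) and fc*sqrt(2),
// and band energy (dB re 1 uPa) = spectrum level (dB re 1 uPa/sqrt(Hz))
// + 10*log10(bandwidth) for a flat spectrum.
public class OctaveBands {

    // Lower and upper edge frequencies of the octave band centred on fc.
    public static double[] bandEdges(double fc) {
        return new double[]{ fc / Math.sqrt(2), fc * Math.sqrt(2) };
    }

    public static double bandwidthHz(double fc) {
        double[] e = bandEdges(fc);
        return e[1] - e[0];
    }

    // Flat spectrum level across the band -> band energy level.
    public static double bandLevel(double spectrumLevelDb, double fc) {
        return spectrumLevelDb + 10 * Math.log10(bandwidthHz(fc));
    }

    public static void main(String[] args) {
        double fc = 1000;  // the 1 kHz octave band
        double[] e = bandEdges(fc);
        System.out.printf("edges %.0f-%.0f Hz, bandwidth %.0f Hz%n",
                e[0], e[1], bandwidthHz(fc));
        System.out.printf("60 dB spectrum level -> %.1f dB band level%n",
                bandLevel(60, fc));
    }
}
```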

    You can visualise the noise band monitor output as frequency vs. spectrum level plots and spectrum level vs. time plots for each analysis window (Figure 2). Results are stored in the binary files and the database. Display as many channels as you like.

    Figure 2. Visualisation of noise band analysis in action. The screen refreshes each analysis interval and the results are saved.

    The help files for “Noise Band Measurement” and “Noise Band Displays and Output” provide handy guides. Next is an example piece of code demonstrating how to extract data from the binary files and plot a time series of noise data.

    Noise Band Monitor MATLAB Tutorial

    This code uses the following functions:
    • findBinaryFiles
    • loadPamguardBinaryFile
    These MATLAB functions are in the most recent version of the PAMGuard-MATLAB library, and can be found freely here. Note that this code is backwards-compatible with previously collected binary files. For more details, see the PAMGuard website. This code will produce a figure like the following from your binary noise band monitor files:

    Figure 3. Demonstration of the output from the above code, showing a time-series of your noise band monitor results.

    So what now? Using time stamps, you can align your noise time series to any other relevant time series, perhaps tidal flow speed or some other environmental covariate, and check how your noise changes with them. Investigate whether noise levels in your click detector bands fluctuate, and then consider how much this will impact your detection probability. For questions about the analysis, don’t hesitate to get in touch. For further details, check out our publication:

    Malinka CE, Gillespie DM, Macaulay JDJ, Joy R & Sparling CE (2018). First in-situ passive acoustic monitoring for marine mammals during operation of a tidal turbine in Ramsey Sound, Wales. Marine Ecology Progress Series 590:247-266. (DOI:

    Guest blog post by Chloe Malinka (@c_malinka). (Featured photo of DeltaStream turbine from