Background
Clausen & Nickisch[1] showed that relatively standard, off-the-shelf machine learning tools can be used to effectively and automatically classify auroral images. On this website you will find information about the tools and the training dataset used, and the code with which you can replicate the results.
In Clausen & Nickisch[1] the following auroral classification was introduced:
Label | Explanation | class6 | class2 |
arc | This label is used for images that show one or multiple bands of aurora that stretch across the field-of-view; typically, the arcs have well-defined, sharp edges. | 0 | 0 |
diffuse | Images that show large patches of aurora, typically with fuzzy edges, are placed in this category. The auroral brightness is on the order of that of stars. | 1 | 0 |
discrete | The images show auroral forms with well-defined, sharp edges, that are, however, not arc-like. The auroral brightness is high compared to that of stars. | 2 | 0 |
cloudy | The sky in these images is dominated by clouds or the dome of the imager is covered with snow. | 3 | 1 |
moon | The image is dominated by light from the Moon. | 4 | 1 |
clear/noaurora | This label is attached to images which show a clear sky (stars and planets are clearly visible) without the appearance of aurora. | 5 | 1 |
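The two numeric columns are related by a fixed lookup: the three auroral labels share class2 = 0 and the three non-auroral labels share class2 = 1. The following is a minimal Python sketch of that relationship. The names CLASS6 and class6_to_class2 are chosen here for illustration and are not taken from the OATH code, and reading class2 as "aurora versus no aurora" is an interpretation of the table.

```python
# Illustrative lookup built from the classification table above.
# CLASS6 and class6_to_class2 are hypothetical names, not part of ridge.py.
CLASS6 = {
    "arc": 0,
    "diffuse": 1,
    "discrete": 2,
    "cloudy": 3,
    "moon": 4,
    "clear/noaurora": 5,
}


def class6_to_class2(class6: int) -> int:
    """Collapse a six-class label into the two-class label.

    Assumed meaning: 0 = aurora (arc, diffuse, discrete),
    1 = no aurora (cloudy, moon, clear/noaurora).
    """
    if class6 not in range(6):
        raise ValueError(f"class6 must be in 0..5, got {class6}")
    return 0 if class6 <= 2 else 1


if __name__ == "__main__":
    for label, c6 in CLASS6.items():
        print(f"{label:15s} class6={c6} class2={class6_to_class2(c6)}")
```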
Download
You can download the training dataset together with the Python3 code that trains the ridge classifier here (about 500MB). It is a tar archive (SHA256SUM here) that, once unpacked, creates the following directory structure and files:
oath/
|
+- 00_README
|
+- classification/
|  |
|  +- classification.csv
|  |
|  +- train_test_split.csv
|
+- code/
|  |
|  +- ridge.py
|  |
|  +- rotate.sh
|
+- features/
|  |
|  +- auroral_feat.h5
|
+- images/
   |
   +- cropped_scaled/
   |  |
   |  +- 00001.png
   |  |
   |  +- 00002.png
   |  |
   |  +- ...
   |  |
   |  +- 05824.png
   |
   +- files_origin.csv
00_README | A text file containing this installation information and license information |
classification.csv | Each line of this file contains information about one image file: numeric class (2 classes), numeric class (6 classes), image index number, label, rotation angle |
train_test_split.csv | This file contains 5 lines of 5824 elements each. The elements are the randomized index numbers of the images. As can be seen from ridge.py, the contents of the file can be used to split the annotated dataset into a training and a test dataset. Including the indices for each split makes it easier to compare the performance of different machines in the future. |
ridge.py | Python code that trains a ridge classifier using the feature vectors extracted from all images of the training dataset |
rotate.sh | A bash script that rotates the original images from the oath/images/cropped_scaled folder and places them in a new folder called oath/images/cropped_scaled_rotated. |
auroral_feat.h5 | HDF5 file containing the feature vectors for the training dataset |
files_origin.csv | Each line of this file contains the original source of one image in the training dataset: the THEMIS ASI station abbreviation, the date and time the image was taken, and its file path in the oath directory, which contains the image index number (00001, 00002, etc.) |
00001.png | THEMIS ASI image, cropped and scaled |
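To illustrate how the two CSV files in oath/classification/ might be consumed, here is a minimal sketch using only the Python standard library. It rests on several assumptions that are not confirmed by the description above: that classification.csv is comma-separated with no header line, that the rotation angle is numeric, and that one line of train_test_split.csv is cut at a 70/30 ratio. The names ImageRecord, read_classification and split_indices are hypothetical; ridge.py remains the authoritative reference for the actual split.

```python
import csv
import io
from typing import NamedTuple


class ImageRecord(NamedTuple):
    """One row of classification.csv, in the column order listed above."""
    class2: int
    class6: int
    index: int
    label: str
    rotation: float  # assumed to be numeric


def read_classification(handle):
    """Parse rows from an open text file; assumes commas and no header line."""
    records = []
    for row in csv.reader(handle):
        if not row:  # skip blank lines
            continue
        class2, class6, index, label, rotation = [field.strip() for field in row]
        records.append(
            ImageRecord(int(class2), int(class6), int(index), label, float(rotation))
        )
    return records


def split_indices(shuffled, train_fraction=0.7):
    """Cut one row of shuffled indices into (train, test) lists.

    The 0.7 default is an assumption for illustration only.
    """
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


if __name__ == "__main__":
    # Invented sample rows for illustration only; not taken from the dataset.
    sample = io.StringIO("0,0,1,arc,12.5\n1,3,2,cloudy,0.0\n0,2,3,discrete,-4.0\n")
    records = read_classification(sample)
    train, test = split_indices([3, 1, 2])
    print(records[0])
    print("train:", train, "test:", test)
```

With the real files, the open file object for classification.csv would be passed to read_classification, and each of the 5 lines of train_test_split.csv would be parsed into a list of integers before being handed to split_indices.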
Installation
Here we describe the installation of the necessary components to replicate the training of a ridge classifier based on auroral feature detection as described in Clausen & Nickisch [1]. This installation was tested on a fresh install of Ubuntu 17.10 (Artful Aardvark), Kernel 4.13.0-37 generic, x86_64, running on a laptop with a four-core Intel Core i7-3520-M CPU (2.9 GHz).
In the following examples the OATH tarball was extracted in the user's home directory ~/.
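Before extracting the tarball, its integrity can be checked against the published SHA256 checksum. The sketch below uses Python's hashlib and is only one possible way to do this. The function names are hypothetical, and the archive and checksum file names are deliberately left as command-line arguments because they are not stated here.

```python
import hashlib
import sys


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's digest with an expected value.

    `expected` may be a bare hex digest or a line in the usual
    "<digest>  <filename>" layout; only the first field is used.
    """
    return sha256_of_file(path) == expected.split()[0].lower()


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (file names are placeholders): python3 verify.py <tarball> <checksum file>
    with open(sys.argv[2]) as checksum_file:
        expected_line = checksum_file.readline()
    print("OK" if matches_checksum(sys.argv[1], expected_line) else "MISMATCH")
```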
- Make sure Python3, git, wget, and imagemagick are installed
sudo apt install python3 git wget imagemagick
- Install several Python3 packages for TensorFlow™
sudo apt install python3-pip python3-dev python3-h5py python3-contextlib2
- Install several Python3 packages for machine learning
sudo apt install python3-matplotlib python3-pandas python3-sklearn
- Install TensorFlow™
mkdir ~/tensorflow/
cd ~/tensorflow
pip3 install tensorflow
git clone https://github.com/tensorflow/models/
cd models/research/slim
sudo python3 setup.py install
- Download pre-trained Inception model checkpoint
cd ~/tensorflow
mkdir checkpoints
cd checkpoints
wget http://download.tensorflow.org/models/inception_v4_2016_09_09.tar.gz
tar xf inception_v4_2016_09_09.tar.gz
- Install TF_FeatureExtraction
cd ~/tensorflow
git clone https://github.com/tomrunia/TF_FeatureExtraction
Running the feature detection
This part assumes that the cropped and scaled images are in the folder ~/oath/images/cropped_scaled (see Download above). First, the images are rotated and placed in the folder ~/oath/images/cropped_scaled_rotated. Then, TF_FeatureExtraction extracts the feature vectors and writes them into an HDF5 file called auroral_feat.h5 in the directory ~/oath/features/. On the laptop mentioned above (four-core Intel Core i7-3520-M) this takes about one hour.
- Rotate the images
cd ~/oath/code
chmod a+x rotate.sh
./rotate.sh
- Run feature extraction
cd ~/tensorflow/TF_FeatureExtraction
# this is one long command, continued across lines with backslashes
python3 example_feat_extract.py --network inception_v4 --checkpoint ../checkpoints/inception_v4.ckpt \
--image_path ~/oath/images/cropped_scaled_rotated/ --out_file ~/oath/features/auroral_feat.h5 \
--layer_names Logits
- Train the ridge classifier
cd ~/oath/code
python3 ridge.py
- This should produce the following output:
0.8174012593016601 0.010271527444147862
[[139 27 66 0 0 14]
[ 37 222 58 1 0 16]
[ 36 37 335 3 0 3]
[ 0 1 5 224 3 2]
[ 0 1 3 2 183 1]
[ 7 13 7 1 1 299]]
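The 6x6 block in this output has the shape of a confusion matrix for the six classes. Its entries sum to 1747 images, roughly 30% of the 5824 images in the dataset. The sketch below shows how overall accuracy and per-class recall could be derived from it. It assumes that rows are true classes and columns are predicted classes, in class6 order; the function names are illustrative. The accuracy of this single matrix is 1402/1747 (about 0.80), which differs from the first number printed above; that number is presumably an average over several splits, with the second number its spread, though this is not stated in the output itself.

```python
# Confusion matrix copied from the output above.
# Assumption: rows are true classes, columns are predicted classes.
CONFUSION = [
    [139, 27, 66, 0, 0, 14],
    [37, 222, 58, 1, 0, 16],
    [36, 37, 335, 3, 0, 3],
    [0, 1, 5, 224, 3, 2],
    [0, 1, 3, 2, 183, 1],
    [7, 13, 7, 1, 1, 299],
]
LABELS = ["arc", "diffuse", "discrete", "cloudy", "moon", "clear/noaurora"]


def overall_accuracy(matrix):
    """Fraction of all images that lie on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def per_class_recall(matrix):
    """For each row, the fraction of that row's images on the diagonal."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


if __name__ == "__main__":
    print(f"accuracy: {overall_accuracy(CONFUSION):.4f}")
    for label, recall in zip(LABELS, per_class_recall(CONFUSION)):
        print(f"{label:15s} recall: {recall:.3f}")
```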
Comments & questions
Comments and questions can be directed to Lasse Clausen.
References
If you use the Oslo Auroral THEMIS dataset, please refer to:
Clausen, L. B. N., & Nickisch, H. (2018). Automatic classification of auroral images from the Oslo Auroral THEMIS (OATH) data set using machine learning. Journal of Geophysical Research: Space Physics, 123, https://doi.org/10.1029/2018JA025274
Acknowledgements & copyright
Unless stated otherwise, all data in the OATH Dataset is licensed under a Creative Commons 4.0 Attribution License (CC BY 4.0) and the accompanying source code is licensed under a BSD-2-Clause License.
In particular, all actual image data included in the tarball are modified from the THEMIS all-sky imagers. We thank H. Frey for giving us permission to include these data. Copyright for these data remains with NASA.
We acknowledge NASA contract NAS5-02099 and V. Angelopoulos for use of data from the THEMIS Mission. Specifically: S. Mende and E. Donovan for use of the ASI data, the CSA for logistical support in fielding and data retrieval from the GBO stations, and NSF for support of GIMNAST through grant AGS-1004736.