Usage
All metadata and features for all tracks are available to browse through this website. You can also download the original CSV files.
ffma.metadata.tar (1.364 GB) can be downloaded from 6 miners.
Choose one:
- lotus client retrieve --maxPrice 0 --miner f0838684 mAXAAehIwCiYBcKDkAiAHSiXzPr5y1iQGx3sPqV4uFCmCFtj90053JKrcHif4bhIAGIugg4AEEjAKJgFwoOQCIBwJcYeExZplYL0AbAxvkFrsqtXCHIUQcaqHnC7yVBKMEgAYz9fAugEKFAgCGIDAv7oFIICAgIAEIIDAv7oB ~/Downloads/ffma.metadata.tar
- lotus client retrieve --maxPrice 0 --miner f0757233 mAXAAehIwCiYBcKDkAiAHSiXzPr5y1iQGx3sPqV4uFCmCFtj90053JKrcHif4bhIAGIugg4AEEjAKJgFwoOQCIBwJcYeExZplYL0AbAxvkFrsqtXCHIUQcaqHnC7yVBKMEgAYz9fAugEKFAgCGIDAv7oFIICAgIAEIIDAv7oB ~/Downloads/ffma.metadata.tar
- lotus client retrieve --maxPrice 0 --miner f01392893 mAXAAehIwCiYBcKDkAiAHSiXzPr5y1iQGx3sPqV4uFCmCFtj90053JKrcHif4bhIAGIugg4AEEjAKJgFwoOQCIBwJcYeExZplYL0AbAxvkFrsqtXCHIUQcaqHnC7yVBKMEgAYz9fAugEKFAgCGIDAv7oFIICAgIAEIIDAv7oB ~/Downloads/ffma.metadata.tar
- lotus client retrieve --maxPrice 0 --miner f01337533 mAXAAehIwCiYBcKDkAiAHSiXzPr5y1iQGx3sPqV4uFCmCFtj90053JKrcHif4bhIAGIugg4AEEjAKJgFwoOQCIBwJcYeExZplYL0AbAxvkFrsqtXCHIUQcaqHnC7yVBKMEgAYz9fAugEKFAgCGIDAv7oFIICAgIAEIIDAv7oB ~/Downloads/ffma.metadata.tar
- lotus client retrieve --maxPrice 0 --miner f030379 mAXAAehIwCiYBcKDkAiAHSiXzPr5y1iQGx3sPqV4uFCmCFtj90053JKrcHif4bhIAGIugg4AEEjAKJgFwoOQCIBwJcYeExZplYL0AbAxvkFrsqtXCHIUQcaqHnC7yVBKMEgAYz9fAugEKFAgCGIDAv7oFIICAgIAEIIDAv7oB ~/Downloads/ffma.metadata.tar
- lotus client retrieve --maxPrice 0 --miner f0694396 mAXAAehIwCiYBcKDkAiAHSiXzPr5y1iQGx3sPqV4uFCmCFtj90053JKrcHif4bhIAGIugg4AEEjAKJgFwoOQCIBwJcYeExZplYL0AbAxvkFrsqtXCHIUQcaqHnC7yVBKMEgAYz9fAugEKFAgCGIDAv7oFIICAgIAEIIDAv7oB ~/Downloads/ffma.metadata.tar
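Since the same archive is offered by several miners, a script can fall back to the next miner when one retrieval fails. The following is a minimal sketch of that idea, not part of the project's scripts: the function name retrieve_from_any is hypothetical, and it assumes that lotus exits with a non-zero status when a retrieval fails.

```python
import subprocess


def retrieve_from_any(miners, cid, out_path, run=subprocess.call):
    """Try each miner in order; return the first miner whose retrieval exits with 0.

    `run` is injectable so the logic can be exercised without lotus installed.
    Assumption: a failed `lotus client retrieve` returns a non-zero exit status.
    """
    for miner in miners:
        cmd = ["lotus", "client", "retrieve", "--maxPrice", "0",
               "--miner", miner, cid, out_path]
        if run(cmd) == 0:
            return miner
    # No miner served the file.
    return None
```

With the default runner, subprocess.call raises FileNotFoundError if the lotus binary is not on the PATH, so install Lotus before calling it for real.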
The metadata contains:
tracks.csv: per-track metadata such as ID, title, artist, genres, tags, and play counts, for all 106,574 tracks.
genres.csv: all 163 genres with name and parent (used to infer the genre hierarchy and top-level genres).
features.csv: common features extracted with librosa.
echonest.csv: audio features provided by Echonest (now Spotify) for a subset of 13,129 tracks.
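The parent column of genres.csv is enough to recover each genre's top-level ancestor. Below is a minimal sketch of that inference, assuming a parent value of 0 marks a root genre. The IDs in sample_parents are made up for illustration and are not taken from the real file.

```python
def top_level(genre_id, parents):
    """Follow parent links until reaching a genre whose parent is 0.

    Assumption: a parent of 0 marks a top-level (root) genre.
    """
    seen = set()
    while parents[genre_id] != 0:
        if genre_id in seen:  # guard against a malformed, cyclic hierarchy
            raise ValueError("cycle in genre hierarchy at id {}".format(genre_id))
        seen.add(genre_id)
        genre_id = parents[genre_id]
    return genre_id


# Hypothetical rows: genre_id -> parent id (illustrative, not real data).
sample_parents = {1: 0, 2: 1, 3: 2, 4: 0}
roots = {gid: top_level(gid, sample_parents) for gid in sample_parents}
print(roots)  # {1: 1, 2: 1, 3: 1, 4: 4}
```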
A few scripts are ready to use with this dataset:
usage.ipynb: shows how to load the datasets and develop, train, and test your own models with it.
analysis.ipynb: exploration of the metadata, data, and features. Creates the figures used in the paper.
baselines.ipynb: baseline models for genre recognition, both from audio and features.
features.py: features extraction from the audio (used to create features.csv).
webapi.ipynb: query the web API of the FMA. Can be used to update the dataset.
creation.ipynb: creation of the dataset (used to create tracks.csv and genres.csv).
creation.py: creation of the dataset (long-running data collection and processing).
utils.py: helper functions and classes.
Using the scripts and the data
1. Clone the scripts available in the GitHub repository:
git clone https://github.com/mdeff/fma.git
cd fma
2. Make sure you have a Python 3.6 environment running; skip this step if you already have one.
# with https://conda.io
conda create -n fma python=3.6
conda activate fma
# with https://github.com/pyenv/pyenv
pyenv install 3.6.0
pyenv virtualenv 3.6.0 fma
pyenv activate fma
# with https://pipenv.pypa.io
pipenv --python 3.6
pipenv shell
# with https://docs.python.org/3/tutorial/venv.html
python3.6 -m venv ./env
source ./env/bin/activate
3. Install the project's dependencies.
pip install --upgrade pip setuptools wheel
pip install numpy==1.12.1
pip install -r requirements.txt
4. Search for the data you want to use (or download everything)
You can do so by browsing our search page.
5. Retrieve the data from the Filecoin network using Lotus
This step uses Lotus, a tool written in Go to communicate with the Filecoin network. Please follow the instructions to install Lotus. At minimum, a light node is required.
lotus client retrieve --maxPrice <max_in_fil_to_pay> --miner <miner_id> <CID> <absolute-path-to-outfile>
For example:
lotus client retrieve --miner f01392893 --maxPrice 0.01 bafyreihk56cdtms3p7lcdej3zkqmgmlkivx4zkkf3srv4pqoxh7x36nfba ~/ffma.1.tar
Make sure you use an absolute path! Relative paths are currently not supported properly.
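When the retrieval is scripted, the output path can be made absolute before the command is built. The sketch below only assembles the argument list shown above and does not run anything. The helper name build_retrieve_command is hypothetical, and the miner ID and CID are the ones from the example.

```python
import os


def build_retrieve_command(miner_id, cid, out_file, max_price="0"):
    """Return the argv list for `lotus client retrieve` with an absolute output path."""
    # Expand "~" and resolve relative paths against the current directory.
    abs_out = os.path.abspath(os.path.expanduser(out_file))
    return ["lotus", "client", "retrieve",
            "--maxPrice", str(max_price),
            "--miner", miner_id,
            cid, abs_out]


cmd = build_retrieve_command(
    "f01392893",
    "bafyreihk56cdtms3p7lcdej3zkqmgmlkivx4zkkf3srv4pqoxh7x36nfba",
    "ffma.1.tar",
    max_price=0.01,
)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.call once Lotus is installed and a node is running.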
This will start downloading the files. If all is well, you will see output like this:
$ lotus client retrieve --maxPrice 0 --miner f01392893 bafyreihk56cdtms3p7lcdej3zkqmgmlkivx4zkkf3srv4pqoxh7x36nfba ~/ffma.1.tar
> Recv: 0 B, Paid 0 FIL, ClientEventOpen (DealStatusNew)
> Recv: 0 B, Paid 0 FIL, ClientEventDealProposed (DealStatusWaitForAcceptance)
> Recv: 0 B, Paid 0 FIL, ClientEventDealAccepted (DealStatusAccepted)
> Recv: 0 B, Paid 0 FIL, ClientEventPaymentChannelSkip (DealStatusOngoing)
> Recv: 234 B, Paid 0 FIL, ClientEventBlocksReceived (DealStatusOngoing)
> Recv: 52.24 KiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusOngoing)
> Recv: 1.051 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusOngoing)
> Recv: 2.051 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusOngoing)
> Recv: 3.051 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusOngoing)
......
> Recv: 3.928 GiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusOngoing)
> Recv: 3.929 GiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusOngoing)
> Recv: 3.93 GiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusOngoing)
> Recv: 3.93 GiB, Paid 0 FIL, ClientEventComplete (DealStatusCheckComplete)
> Recv: 3.93 GiB, Paid 0 FIL, ClientEventWaitForLastBlocks (DealStatusWaitingForLastBlocks)
> Recv: 3.931 GiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitingForLastBlocks)
> Recv: 3.931 GiB, Paid 0 FIL, ClientEventAllBlocksReceived (DealStatusFinalizingBlockstore)
> Recv: 3.931 GiB, Paid 0 FIL, ClientEventBlockstoreFinalized (DealStatusCompleted)
Success
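If you capture this output in a script, each progress line can be parsed to track how much has been received and whether the deal has completed. This is a rough sketch that assumes the line layout stays exactly as in the sample above; the names LINE_RE and parse_progress are illustrative and not part of Lotus.

```python
import re

# Matches lines such as:
# "> Recv: 3.931 GiB, Paid 0 FIL, ClientEventBlockstoreFinalized (DealStatusCompleted)"
LINE_RE = re.compile(
    r"^>\s*Recv:\s*(?P<amount>[\d.]+)\s*(?P<unit>B|KiB|MiB|GiB),\s*"
    r"Paid\s*(?P<paid>[\d.]+)\s*FIL,\s*(?P<event>\w+)\s*\((?P<status>\w+)\)$"
)
UNIT_BYTES = {"B": 1, "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3}


def parse_progress(line):
    """Return a dict describing one progress line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    received = float(match.group("amount")) * UNIT_BYTES[match.group("unit")]
    return {
        "received_bytes": received,
        "paid_fil": float(match.group("paid")),
        "event": match.group("event"),
        "status": match.group("status"),
    }


last = parse_progress(
    "> Recv: 3.931 GiB, Paid 0 FIL, ClientEventBlockstoreFinalized (DealStatusCompleted)"
)
print(last["status"])  # DealStatusCompleted
```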
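You can verify the download by running md5sum on the file and comparing the result to the checksum listed here.
You can also verify it automatically. Note the two spaces between the checksum and the filename, and that ~ is not expanded inside double quotes:
echo "<checksum>  <filename>" | md5sum -c
Example:
$ cd ~
$ echo "35ad94da86e49be46cbc2ec55dd0188b  ffma.1.tar" | md5sum -c
ffma.1.tar: OK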
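On systems without md5sum, the same check can be done from Python. This is a minimal sketch using only the standard library. It reads the file in chunks so that a multi-gigabyte archive does not have to fit in memory. The function names are illustrative.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected_hex):
    """Return True if the file's MD5 matches the expected checksum (case-insensitive)."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

For the example above, verify_md5 would be called with the full path to ffma.1.tar and the checksum 35ad94da86e49be46cbc2ec55dd0188b.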
6. Extract your files
tar -xvf ffma.1.tar
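The extraction can also be done from Python, which makes it possible to reject archive members that would land outside the destination directory. This is an optional sketch built on the standard tarfile module. The function name safe_extract is hypothetical, and nothing here is required by the project's scripts.

```python
import os
import tarfile


def safe_extract(tar_path, dest_dir):
    """Extract a tar archive, refusing members that would escape dest_dir."""
    dest_root = os.path.realpath(dest_dir)
    os.makedirs(dest_root, exist_ok=True)
    with tarfile.open(tar_path) as archive:
        for member in archive.getmembers():
            target = os.path.realpath(os.path.join(dest_root, member.name))
            # Every member must resolve to the destination or a path below it.
            if target != dest_root and not target.startswith(dest_root + os.sep):
                raise ValueError("unsafe path in archive: {}".format(member.name))
        archive.extractall(dest_root)
    return dest_root
```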
7. Add an environment file (.env)
echo "AUDIO_DIR=./data/fma/ # the path to a decompressed tar file" > .env
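The line written to .env is a KEY=VALUE pair followed by an inline comment. To illustrate how such a line is typically interpreted, here is a simplified stand-in parser. It is not the loader the scripts actually use, and it assumes that a comment starts at a space followed by "#".

```python
def parse_env_line(line):
    """Parse one KEY=VALUE line; return (key, value), or None for blanks and comments."""
    line = line.strip()
    if not line or line.startswith("#") or "=" not in line:
        return None
    key, value = line.split("=", 1)
    # Assumption: an inline comment starts at whitespace followed by "#".
    value = value.split(" #", 1)[0].strip()
    return key.strip(), value


print(parse_env_line("AUDIO_DIR=./data/fma/ # the path to a decompressed tar file"))
# ('AUDIO_DIR', './data/fma/')
```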
8. Open Jupyter or run a notebook
jupyter notebook
make usage.ipynb