Usage

All metadata and features for all tracks are available to browse through this website. You can also download the original CSV files:

ffma.metadata.tar (1.364 GB) can be downloaded from 6 miners.
Choose one:
  • lotus client retrieve --maxPrice 0 --miner f0838684 mAXAAehIwCiYBcKDkAiAHSiXzPr5y1iQGx3sPqV4uFCmCFtj90053JKrcHif4bhIAGIugg4AEEjAKJgFwoOQCIBwJcYeExZplYL0AbAxvkFrsqtXCHIUQcaqHnC7yVBKMEgAYz9fAugEKFAgCGIDAv7oFIICAgIAEIIDAv7oB ~/Downloads/ffma.metadata.tar
  • lotus client retrieve --maxPrice 0 --miner f0757233 mAXAAehIwCiYBcKDkAiAHSiXzPr5y1iQGx3sPqV4uFCmCFtj90053JKrcHif4bhIAGIugg4AEEjAKJgFwoOQCIBwJcYeExZplYL0AbAxvkFrsqtXCHIUQcaqHnC7yVBKMEgAYz9fAugEKFAgCGIDAv7oFIICAgIAEIIDAv7oB ~/Downloads/ffma.metadata.tar
  • lotus client retrieve --maxPrice 0 --miner f01392893 mAXAAehIwCiYBcKDkAiAHSiXzPr5y1iQGx3sPqV4uFCmCFtj90053JKrcHif4bhIAGIugg4AEEjAKJgFwoOQCIBwJcYeExZplYL0AbAxvkFrsqtXCHIUQcaqHnC7yVBKMEgAYz9fAugEKFAgCGIDAv7oFIICAgIAEIIDAv7oB ~/Downloads/ffma.metadata.tar
  • lotus client retrieve --maxPrice 0 --miner f01337533 mAXAAehIwCiYBcKDkAiAHSiXzPr5y1iQGx3sPqV4uFCmCFtj90053JKrcHif4bhIAGIugg4AEEjAKJgFwoOQCIBwJcYeExZplYL0AbAxvkFrsqtXCHIUQcaqHnC7yVBKMEgAYz9fAugEKFAgCGIDAv7oFIICAgIAEIIDAv7oB ~/Downloads/ffma.metadata.tar
  • lotus client retrieve --maxPrice 0 --miner f030379 mAXAAehIwCiYBcKDkAiAHSiXzPr5y1iQGx3sPqV4uFCmCFtj90053JKrcHif4bhIAGIugg4AEEjAKJgFwoOQCIBwJcYeExZplYL0AbAxvkFrsqtXCHIUQcaqHnC7yVBKMEgAYz9fAugEKFAgCGIDAv7oFIICAgIAEIIDAv7oB ~/Downloads/ffma.metadata.tar
  • lotus client retrieve --maxPrice 0 --miner f0694396 mAXAAehIwCiYBcKDkAiAHSiXzPr5y1iQGx3sPqV4uFCmCFtj90053JKrcHif4bhIAGIugg4AEEjAKJgFwoOQCIBwJcYeExZplYL0AbAxvkFrsqtXCHIUQcaqHnC7yVBKMEgAYz9fAugEKFAgCGIDAv7oFIICAgIAEIIDAv7oB ~/Downloads/ffma.metadata.tar
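Any single miner in the list above may be unreachable, so it can help to try them in turn. The following is a minimal Python sketch, not part of the project's scripts: the names `build_retrieve_command` and `retrieve_from_any` are hypothetical, and it assumes the `lotus` binary is on your PATH and exits with a non-zero status when a retrieval fails. The CID is passed in as an argument rather than hard-coded.

```python
import os
import subprocess

# Miner IDs copied from the list above.
MINERS = ["f0838684", "f0757233", "f01392893", "f01337533", "f030379", "f0694396"]


def build_retrieve_command(miner, cid, outfile, max_price="0"):
    """Build the argument list for one `lotus client retrieve` call."""
    # Expand ~ and convert to an absolute path before handing it to lotus.
    outfile = os.path.abspath(os.path.expanduser(outfile))
    return ["lotus", "client", "retrieve", "--maxPrice", str(max_price),
            "--miner", miner, cid, outfile]


def retrieve_from_any(cid, outfile, miners=MINERS, max_price="0"):
    """Try each miner in order; return the first one that succeeds, else None."""
    for miner in miners:
        cmd = build_retrieve_command(miner, cid, outfile, max_price)
        try:
            result = subprocess.run(cmd)
        except FileNotFoundError:
            raise RuntimeError("lotus binary not found on PATH")
        # Assumption: a zero exit status means the retrieval completed.
        if result.returncode == 0:
            return miner
    return None
```

Calling `retrieve_from_any(cid, "~/Downloads/ffma.metadata.tar")` with the CID shown above would run the same commands as the list, one miner at a time.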


The metadata contains:

A few scripts are ready to use with this dataset:
  1. usage.ipynb: shows how to load the datasets and develop, train, and test your own models with it.
  2. analysis.ipynb: exploration of the metadata, data, and features. Creates the figures used in the paper.
  3. baselines.ipynb: baseline models for genre recognition, both from audio and features.
  4. features.py: features extraction from the audio (used to create features.csv).
  5. webapi.ipynb: query the web API of the FMA. Can be used to update the dataset.
  6. creation.ipynb: creation of the dataset (used to create tracks.csv and genres.csv).
  7. creation.py: creation of the dataset (long-running data collection and processing).
  8. utils.py: helper functions and classes.



Using the scripts and the data

1. Clone the scripts from the GitHub repository:

git clone https://github.com/mdeff/fma.git
cd fma

2. Make sure you have a Python 3.6 environment running - skip this step if you already have it.

# with https://conda.io
conda create -n fma python=3.6
conda activate fma

# with https://github.com/pyenv/pyenv
pyenv install 3.6.0
pyenv virtualenv 3.6.0 fma
pyenv activate fma

# with https://pipenv.pypa.io
pipenv --python 3.6
pipenv shell

# with https://docs.python.org/3/tutorial/venv.html
python3.6 -m venv ./env
source ./env/bin/activate

3. Install the project's dependencies.

pip install --upgrade pip setuptools wheel
pip install numpy==1.12.1
pip install -r requirements.txt

4. Search for the data you want to use (or download everything)

You can do so by browsing our search page.

5. Retrieve the data from the Filecoin network using Lotus

This step uses Lotus - a tool written in Go to communicate with the Filecoin network. Please follow the instructions to install Lotus. A minimum of a light node is required.
lotus client retrieve --maxPrice <max_in_fil_to_pay> --miner <miner_id> <CID> <absolute-path-to-outfile>
For example:
lotus client retrieve --miner f01392893 --maxPrice 0.01 bafyreihk56cdtms3p7lcdej3zkqmgmlkivx4zkkf3srv4pqoxh7x36nfba ~/ffma.1.tar
Make sure you use an absolute path! Relative paths are currently not supported properly.
This will start downloading the files. If all is well, you will see a response like this:
$ lotus client retrieve --maxPrice 0 --miner f01392893 bafyreihk56cdtms3p7lcdej3zkqmgmlkivx4zkkf3srv4pqoxh7x36nfba ~/ffma.1.tar
> Recv: 0 B, Paid 0 FIL, ClientEventOpen (DealStatusNew)
> Recv: 0 B, Paid 0 FIL, ClientEventDealProposed (DealStatusWaitForAcceptance)
> Recv: 0 B, Paid 0 FIL, ClientEventDealAccepted (DealStatusAccepted)
> Recv: 0 B, Paid 0 FIL, ClientEventPaymentChannelSkip (DealStatusOngoing)
> Recv: 234 B, Paid 0 FIL, ClientEventBlocksReceived (DealStatusOngoing)
> Recv: 52.24 KiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusOngoing)
> Recv: 1.051 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusOngoing)
> Recv: 2.051 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusOngoing)
> Recv: 3.051 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusOngoing)
......
> Recv: 3.928 GiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusOngoing)
> Recv: 3.929 GiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusOngoing)
> Recv: 3.93 GiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusOngoing)
> Recv: 3.93 GiB, Paid 0 FIL, ClientEventComplete (DealStatusCheckComplete)
> Recv: 3.93 GiB, Paid 0 FIL, ClientEventWaitForLastBlocks (DealStatusWaitingForLastBlocks)
> Recv: 3.931 GiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitingForLastBlocks)
> Recv: 3.931 GiB, Paid 0 FIL, ClientEventAllBlocksReceived (DealStatusFinalizingBlockstore)
> Recv: 3.931 GiB, Paid 0 FIL, ClientEventBlockstoreFinalized (DealStatusCompleted)
Success
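If you capture this output in a script, each progress line can be turned into structured data. The sketch below is an illustration only: the line format is inferred solely from the sample response above and may differ between Lotus versions, and `parse_progress` is a hypothetical helper name.

```python
import re

# Binary size units as they appear in the sample output.
_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3, "TiB": 1024 ** 4}

# Pattern inferred from the sample lines, e.g.
# "> Recv: 52.24 KiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusOngoing)"
_LINE = re.compile(
    r"Recv:\s*(?P<size>[\d.]+)\s*(?P<unit>[KMGT]iB|B),\s*"
    r"Paid\s*(?P<paid>[\d.]+)\s*FIL,\s*"
    r"(?P<event>\w+)\s*\((?P<status>\w+)\)"
)


def parse_progress(line):
    """Return a dict for a progress line, or None if the line does not match."""
    match = _LINE.search(line)
    if match is None:
        return None
    return {
        "bytes": int(float(match.group("size")) * _UNITS[match.group("unit")]),
        "paid_fil": float(match.group("paid")),
        "event": match.group("event"),
        "status": match.group("status"),
    }
```

A caller could, for instance, watch for a line whose `status` is `DealStatusCompleted` before moving on to the checksum step.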
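You can verify the checksum by running md5sum on the downloaded file and comparing the result to the checksum listed here.
You can also verify it automatically. Separate the checksum and the filename with two spaces (the format md5sum itself writes), and use $HOME rather than ~ inside the quotes, because the shell does not expand ~ within double quotes:
echo "checksum  filename" | md5sum -c
Example:
$ echo "35ad94da86e49be46cbc2ec55dd0188b  $HOME/ffma.1.tar" | md5sum -c
<your-home-directory>/ffma.1.tar: OK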
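Where md5sum is not installed, the same check can be done from Python. This is a minimal sketch using only the standard library; `md5_of_file` and `verify_md5` are hypothetical helper names, and the file is read in chunks so that a multi-gigabyte archive is never held in memory all at once.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Compare the file's digest with an expected hex string (case-insensitive)."""
    return md5_of_file(path) == expected.strip().lower()
```

For the example above, `verify_md5(os.path.expanduser("~/ffma.1.tar"), "35ad94da86e49be46cbc2ec55dd0188b")` (after `import os`) should return True for an intact download.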

6. Extract your files

tar -xvf ffma.1.tar
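The extraction can also be scripted. The sketch below uses Python's standard `tarfile` module; `extract_archive` is a hypothetical helper, and the path check is an extra precaution added here, not something the tar command above performs. It only guards against member names that point outside the destination directory.

```python
import os
import tarfile


def extract_archive(archive_path, dest_dir):
    """Extract a tar archive into dest_dir and return the member names."""
    dest_dir = os.path.abspath(dest_dir)
    # Mode "r" lets tarfile detect compression (if any) automatically.
    with tarfile.open(archive_path, "r") as tar:
        members = tar.getmembers()
        for member in members:
            target = os.path.abspath(os.path.join(dest_dir, member.name))
            # Refuse members whose path would land outside dest_dir.
            if os.path.commonpath([dest_dir, target]) != dest_dir:
                raise ValueError("unsafe path in archive: " + member.name)
        tar.extractall(dest_dir, members=members)
    return [m.name for m in members]
```

The returned list of names is a quick way to see which directory to point AUDIO_DIR at in the next step.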

7. Add an environment file (.env)

echo "AUDIO_DIR=./data/fma/ # the path to a decompressed tar file" > .env
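To check that the file was written as intended, you can read it back. How the project's scripts load `.env` is not shown here, so the following is a stdlib-only sketch under that caveat; `read_env_file` and `audio_dir` are hypothetical names, and the parser handles only simple KEY=VALUE lines with optional `#` comments.

```python
import os


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            # Drop an inline comment such as " # the path to ...".
            value = value.split(" #", 1)[0].strip().strip("'\"")
            values[key.strip()] = value
    return values


def audio_dir(path=".env"):
    """Prefer AUDIO_DIR from the process environment, else fall back to .env."""
    return os.environ.get("AUDIO_DIR") or read_env_file(path).get("AUDIO_DIR")
```

With the file created by the command above, `audio_dir()` would return `./data/fma/` unless AUDIO_DIR is already set in your shell.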

8. Open Jupyter or run a notebook

jupyter notebook
make usage.ipynb
