Project – PDF Data Extraction

The Autorité des Marchés Financiers (AMF) is the French financial markets regulator, responsible for protecting investors and ensuring that financial markets function properly.

The purpose of this project is to extract information from several PDF documents. We used 12 PDF files of AMF practice exams, each containing 120 questions. Once the questions and answers have been extracted, they are loaded into a DataFrame so that the data can be reused for exam practice. The work goes through several intermediate steps: data extraction, formatting the data into a table, filtering by keyword, and filtering of successfully answered questions. A function that asks questions at random then lets us practise the AMF questions as many times as necessary.
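
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. It assumes the PDF text has already been extracted to a string, and that each question follows the made-up layout in `SAMPLE_TEXT` (a numbered question, lettered choices, then an `Answer:` line). The real AMF files may be laid out differently. The function names `parse_questions`, `filter_by_keyword` and `pick_question` are hypothetical. The sketch also assumes that "filtering of successfully answered questions" means skipping questions already answered correctly.

```python
import random
import re

# Made-up sample standing in for the text extracted from one exam PDF.
# The layout (number, lettered choices, "Answer:" line) is an assumption.
SAMPLE_TEXT = """\
1. Placeholder question about bonds?
A. First choice
B. Second choice
Answer: B
2. Placeholder question about equities?
A. First choice
B. Second choice
C. Third choice
Answer: A
"""

QUESTION_RE = re.compile(
    r"^(?P<number>\d+)\.\s+(?P<question>.+?)\n"   # "1. Question text"
    r"(?P<choices>(?:[A-Z]\.\s+.+\n)+)"            # one or more "A. choice" lines
    r"Answer:\s*(?P<answer>[A-Z])\s*$",            # "Answer: B"
    re.MULTILINE,
)
CHOICE_RE = re.compile(r"^([A-Z])\.\s+(.+)$", re.MULTILINE)


def parse_questions(text, source):
    """Turn extracted text into one row (dict) per question."""
    rows = []
    for m in QUESTION_RE.finditer(text):
        rows.append({
            "source": source,
            "number": int(m["number"]),
            "question": m["question"].strip(),
            "choices": dict(CHOICE_RE.findall(m["choices"])),
            "answer": m["answer"],
        })
    return rows


def filter_by_keyword(rows, keyword):
    """Keep the rows whose question text contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [row for row in rows if needle in row["question"].lower()]


def pick_question(rows, passed, rng=random):
    """Pick a random question not yet answered correctly, or None if none remain.

    `passed` is a set of (source, number) pairs for questions already passed.
    """
    remaining = [r for r in rows if (r["source"], r["number"]) not in passed]
    return rng.choice(remaining) if remaining else None


rows = parse_questions(SAMPLE_TEXT, source="exam_01.pdf")  # hypothetical file name
passed = set()
question = pick_question(rows, passed)
# After a correct reply, record the question so it is not asked again.
passed.add((question["source"], question["number"]))
```

The list of dictionaries returned by `parse_questions` can be passed directly to `pandas.DataFrame(rows)` to obtain the tabular form described above. Keeping `(source, number)` pairs in a set is one simple way to track which questions have been passed across the 12 files.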