.. role:: hidden
    :class: hidden-section

Getting Started
===============

.. code-block:: python

    from deepparse.parser import AddressParser
    from deepparse.dataset_container import CSVDatasetContainer

    address_parser = AddressParser(model_type="bpemb", device=0)

    # you can parse one address
    parsed_address = address_parser("350 rue des Lilas Ouest Québec Québec G1L 1B6")

    # or multiple addresses
    parsed_address = address_parser(
        [
            "350 rue des Lilas Ouest Québec Québec G1L 1B6",
            "350 rue des Lilas Ouest Québec Québec G1L 1B6",
        ]
    )

    # or multinational addresses
    # Canada, US, Germany, UK and South Korea
    parsed_address = address_parser(
        [
            "350 rue des Lilas Ouest Québec Québec G1L 1B6",
            "777 Brockton Avenue, Abington MA 2351",
            "Ansgarstr. 4, Wallenhorst, 49134",
            "221 B Baker Street",
            "서울특별시 종로구 사직로3길 23",
        ]
    )

    # you can also get the probability of the predicted tags
    parsed_address = address_parser(
        "350 rue des Lilas Ouest Québec Québec G1L 1B6", with_prob=True
    )

    # print the parsed address
    print(parsed_address)

    # or use one of our dataset containers
    addresses_to_parse = CSVDatasetContainer(
        "./a_path.csv",
        column_names=["address_column_name"],
        is_training_container=False,
    )
    address_parser(addresses_to_parse)

The default prediction tags are the following:

- ``"StreetNumber"``: for the street number,
- ``"StreetName"``: for the name of the street,
- ``"Unit"``: for the unit (such as an apartment),
- ``"Municipality"``: for the municipality,
- ``"Province"``: for the province or local region,
- ``"PostalCode"``: for the postal code,
- ``"Orientation"``: for the street orientation (e.g. west, east),
- ``"GeneralDelivery"``: for other delivery information.

Parse Addresses From the Command Line
*************************************

You can also use our CLI to parse addresses:

.. code-block:: sh

    parse

Parse Addresses Using Your Own Retrained Model
**********************************************

See `here `__ for a complete example.

.. code-block:: python

    address_parser = AddressParser(
        model_type="bpemb",
        device=0,
        path_to_retrained_model="path/to/retrained/bpemb/model.p",
    )

    address_parser("350 rue des Lilas Ouest Québec Québec G1L 1B6")

Retrain a Model
***************

See `here `__ for a complete example using Pickle and `here `__ for a complete example using CSV.

.. code-block:: python

    # training_container is a dataset container holding the training data
    # (see the complete Pickle and CSV examples linked above)
    address_parser.retrain(training_container, train_ratio=0.8, epochs=5, batch_size=8)

You can also freeze some layers to speed up training with the ``layers_to_freeze`` parameter.

.. code-block:: python

    address_parser.retrain(
        training_container,
        train_ratio=0.8,
        epochs=5,
        batch_size=8,
        layers_to_freeze="seq2seq",
    )

You can also give the retrained model a specific name. This name will be used as the model name (for printing and as the class name) when the model is reloaded.

.. code-block:: python

    address_parser.retrain(
        training_container,
        train_ratio=0.8,
        epochs=5,
        batch_size=8,
        name_of_the_retrain_parser="MyNewParser",
    )

Retrain a Model With an Attention Mechanism
*******************************************

See `here `__ for a complete example.

.. code-block:: python

    # We will retrain the fasttext version of our pretrained model.
    address_parser = AddressParser(model_type="fasttext", device=0, attention_mechanism=True)

    address_parser.retrain(training_container, train_ratio=0.8, epochs=5, batch_size=8)

Retrain a Model With New Tags
*****************************

See `here `__ for a complete example.

.. code-block:: python

    # map each new tag to an index
    address_components = {"ATag": 0, "AnotherTag": 1, "EOS": 2}

    address_parser.retrain(
        training_container,
        train_ratio=0.8,
        epochs=1,
        batch_size=128,
        prediction_tags=address_components,
    )

Retrain a Seq2Seq Model From Scratch
************************************

See `here `__ for a complete example.

.. code-block:: python

    seq2seq_params = {"encoder_hidden_size": 512, "decoder_hidden_size": 512}

    address_parser.retrain(
        training_container,
        train_ratio=0.8,
        epochs=1,
        batch_size=128,
        seq2seq_params=seq2seq_params,
    )

Download Our Models
*******************

Deepparse downloads the model it needs when you use it, but you can also pre-download our models. Here are the URLs to download our pretrained models directly:

- `FastText `__,
- `FastTextAttention `__,
- `BPEmb `__,
- `BPEmbAttention `__,
- `FastText Light `__ (using `Magnitude Light `__).

Or you can use our CLI to download our pretrained models directly:

.. code-block:: sh

    download_model
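
Prepare a CSV File for a Dataset Container
******************************************

The Getting Started example reads addresses from ``./a_path.csv`` through ``CSVDatasetContainer`` with ``column_names=["address_column_name"]``. The sketch below shows one way to produce such a file using only the Python standard library. It assumes the container expects a header row followed by one address per row; the helper ``write_address_csv`` is hypothetical and not part of Deepparse.

.. code-block:: python

    import csv
    from pathlib import Path


    def write_address_csv(path, addresses, column_name="address_column_name"):
        """Hypothetical helper: write one address per row under a single header."""
        path = Path(path)
        # newline="" lets the csv module control line endings itself
        with path.open("w", newline="", encoding="utf-8") as csv_file:
            writer = csv.writer(csv_file)
            writer.writerow([column_name])  # header row (assumed to be required)
            writer.writerows([address] for address in addresses)
        return path


    addresses = [
        "350 rue des Lilas Ouest Québec Québec G1L 1B6",
        "777 Brockton Avenue, Abington MA 2351",
    ]
    csv_path = write_address_csv("./a_path.csv", addresses)

The ``csv`` module quotes addresses that contain commas, so each address stays in a single column. The resulting file can then be passed to ``CSVDatasetContainer`` as in the Getting Started example.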