    AI-Powered Data Extraction: Training a Custom Model with AlgoDocs to Process BOL Documents

    Introduction

    In this article, we discuss how to use AI-based data extraction by training a custom model in AlgoDocs, focusing on bill of lading (BOL) documents. For this demonstration, we use BOL documents from a real-world use case, so certain sections of the documents are masked; our goal is to capture the data that remains visible. The documents are photos taken with mobile devices, so layouts and field positions vary from one document to the next.

    Overview

    Our trained model will capture the following fields:

    • Bill of Lading Number
    • Origin Terminal Number
    • Pro Number
    • Number of Pallets
    • Ship to City, State, and ZIP Code
    • Prepaid Options (selected or unselected)
    • Items Table (with specific columns: Weight, Commodity Description, and Class)

    Step 1: Creating the Extractor

    We start by creating an extractor and naming it "BOL Extractor." After selecting a sample document to process, we move to the extractor editor. AlgoDocs offers several extraction methods; we choose the custom model option, which takes us to the AI custom model editor. The sample document we uploaded is processed during this setup.

    Step 2: Labeling Documents

    To train our custom model successfully, it is essential to label documents accurately. We need at least ten labeled documents before training. The initial document is already uploaded, so we select nine more files and set the last two aside for testing the model. For complex cases, labeling more than ten files is recommended, as a larger dataset usually yields better accuracy.
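    The split described above (ten files for labeling, two held back for testing) can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not part of AlgoDocs; the file names and the `split_documents` helper are hypothetical.

```python
MIN_TRAINING_DOCS = 10  # minimum number of labeled documents stated in the article


def split_documents(paths, holdout=2, minimum=MIN_TRAINING_DOCS):
    """Return (training, testing); the last `holdout` files are kept for testing."""
    ordered = sorted(str(p) for p in paths)
    cut = len(ordered) - holdout
    if cut < minimum:
        raise ValueError(
            f"need at least {minimum} training documents, only {cut} available"
        )
    return ordered[:cut], ordered[cut:]


# Hypothetical file names: 12 photos of BOL documents.
files = [f"bol_{i:02d}.jpg" for i in range(1, 13)]
training, testing = split_documents(files)
```

    With twelve files, ten go to labeling and two remain untouched until the model is trained, so the test reflects documents the model has never seen.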

    Next, we create fields in the right pane. Each field requires a type: field, table, or selection mark. For example, "Prepaid" is a selection mark, while "Items" is a table. We add all the fields we need: Bill of Lading Number, Origin Terminal Number, Pro Number, Number of Pallets, Ship to City, Ship to State, Ship to ZIP Code, Prepaid, and the Items table.
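    For readers who want to keep the field list under version control or reuse it in downstream code, the schema above could be written down as plain data. The sketch below is an assumption about how one might model it in Python; `FieldType`, `FieldSpec`, and `BOL_FIELDS` are hypothetical names, not AlgoDocs objects.

```python
from dataclasses import dataclass
from enum import Enum


class FieldType(Enum):
    # The three type designations named in the article.
    FIELD = "field"
    TABLE = "table"
    SELECTION_MARK = "selection mark"


@dataclass(frozen=True)
class FieldSpec:
    name: str
    type: FieldType
    columns: tuple = ()  # only meaningful for tables


BOL_FIELDS = [
    FieldSpec("Bill of Lading Number", FieldType.FIELD),
    FieldSpec("Origin Terminal Number", FieldType.FIELD),
    FieldSpec("Pro Number", FieldType.FIELD),
    FieldSpec("Number of Pallets", FieldType.FIELD),
    FieldSpec("Ship to City", FieldType.FIELD),
    FieldSpec("Ship to State", FieldType.FIELD),
    FieldSpec("Ship to ZIP Code", FieldType.FIELD),
    FieldSpec("Prepaid", FieldType.SELECTION_MARK),
    FieldSpec("Items", FieldType.TABLE, ("Weight", "Commodity Description", "Class")),
]


def fields_of_type(specs, field_type):
    """Return the names of all fields with the given type."""
    return [spec.name for spec in specs if spec.type is field_type]
```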

    Step 3: Document Labeling Process

    The labeling process is critical because the AI model learns from the annotations, so they must be accurate. To label a field or selection mark, we click on its value in the document. For the items table, we can use the auto-labeling feature, which detects tables automatically.

    Next, we assign the appropriate column names to the detected table and mark unnecessary header rows as ignored. This ensures that the model learns only the item rows. After labeling all documents, we train the model by clicking the "Train" button.
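    Conceptually, labeling a detected table means skipping header rows and mapping the remaining cells to the chosen column names. The sketch below mimics that idea on plain Python lists; it is not how AlgoDocs implements it, and the `label_table` helper and sample rows are made up for illustration.

```python
def label_table(rows, column_names, header_rows=1):
    """Skip header rows, then map each row's cells to the given column names.

    A column name of None means the column is ignored.
    """
    labeled = []
    for row in rows[header_rows:]:
        if len(row) != len(column_names):
            raise ValueError(f"expected {len(column_names)} cells, got {len(row)}")
        labeled.append(
            {
                name: cell.strip()
                for name, cell in zip(column_names, row)
                if name is not None
            }
        )
    return labeled


# Illustrative values only; not taken from a real BOL document.
detected = [
    ["Pieces", "Weight (lbs)", "Description", "Class"],  # header row, ignored
    ["2", "450 ", "Auto parts", "70"],
    ["1", "120", "Paper goods", "55"],
]
columns = [None, "Weight", "Commodity Description", "Class"]
items = label_table(detected, columns)
```

    Here the first column is dropped and only Weight, Commodity Description, and Class are kept, matching the three columns the extractor is meant to capture.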

    Step 4: Post-Formatting and Training

    Once training starts, we are taken back to the extractor editor. While the model trains, which can take up to an hour, we can apply formatting adjustments here (e.g., removing unwanted commas from city names).
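    In AlgoDocs this kind of adjustment is configured in the extractor editor. If the same cleanup were done in your own code after export, it might look like the following sketch; `clean_city` is a hypothetical helper and the sample inputs are invented.

```python
def clean_city(raw):
    """Replace commas with spaces, then collapse runs of whitespace."""
    return " ".join(raw.replace(",", " ").split())


# Invented examples of how a captured city value might look.
samples = ["Salt Lake City,", " Dallas , ", "Austin"]
cleaned = [clean_city(s) for s in samples]
```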

    Step 5: Testing the Custom Model

    Once training is complete, we test the model with the files we set aside earlier. Creating a new folder makes it easy to upload the documents for processing. After we upload the files, the captured data appears in the extracted data section.

    After testing, we can export the extracted data in formats such as Excel, JSON, or XML. We can also review and correct any inaccuracies before finalizing the results. In addition, AlgoDocs integrations can automate file imports and data retrieval.
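    To give a feel for what the same record looks like in JSON and in XML, here is a small sketch using only the Python standard library. The record layout, tag names, and values are assumptions made for illustration; they are not AlgoDocs's actual export schema.

```python
import json
import xml.etree.ElementTree as ET


def to_json(record):
    """Serialize one extracted record as indented JSON."""
    return json.dumps(record, indent=2)


def to_xml(record, root_tag="document"):
    """Serialize one record as XML; spaces in names become underscores."""
    root = ET.Element(root_tag)
    for key, value in record.items():
        tag = key.replace(" ", "_")
        if isinstance(value, list):  # a table: one <row> element per item
            table = ET.SubElement(root, tag)
            for row in value:
                row_el = ET.SubElement(table, "row")
                for col, cell in row.items():
                    ET.SubElement(row_el, col.replace(" ", "_")).text = str(cell)
        else:
            ET.SubElement(root, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical record; field names follow the article, values are invented.
record = {
    "Bill of Lading Number": "BOL-0001",
    "Prepaid": True,
    "Items": [
        {"Weight": "450", "Commodity Description": "Auto parts", "Class": "70"}
    ],
}
json_text = to_json(record)
xml_text = to_xml(record)
```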

    For questions or support regarding this process, contact support at algodocs.com.


    Keywords

    • AI-based data extraction
    • Custom model training
    • Bill of lading (BOL) documents
    • Document labeling
    • Extraction methods
    • AlgoDocs
    • Data export formats
    • Integration automation

    FAQ

    1. What is AlgoDocs?
    AlgoDocs is a platform for data extraction and automation that works across many types of documents.

    2. How many documents do I need to train a custom model?
    You need at least ten labeled documents to train the model effectively, although more is recommended for complex cases.

    3. Can I adjust the extracted data after processing?
    Yes, you can review and apply corrections to the extracted data if needed.

    4. What formats can I export the extracted data into?
    The extracted data can be exported in Excel, JSON, or XML formats.

    5. What should I do if I need support while using AlgoDocs?
    You can contact support at algodocs.com for help with any questions or issues regarding the platform.
