Create MinerU Archive File

Before creating a MinerU archive file, convert your PDF with MinerU and locate the MinerU output directory. To learn how to convert PDF files, visit the MinerU official website, or see the MinerU GitHub repository for how to deploy and run MinerU locally on a Mac.

Using MinerU Client

Open the output directory from the MinerU client.

In the MinerU output directory, select the content_list.json and origin.pdf files and the images folder. Then right-click and choose Compress 3 Items.

After compression is complete, a compressed file named Archive.zip will be generated. Rename this file to your_filename.mineru to get the MinerU archive file.

Now you can import the MinerU archive file into DoCube.

Local Deployment on Mac

  • For local deployment, a Mac with an Apple Silicon chip and at least 16GB of memory is recommended.
  • Local deployment steps are relatively complex. If you encounter issues, refer to the official repository or contact DoCube.
  • Because Mac environments differ, terminal appearance and output may vary. The terminal output shown in the following instructions is for reference only.

Environment Setup

Open the `Terminal` application and enter the following command to create a Python virtual environment named mineru:
python3 -m venv mineru
Then activate the mineru environment by entering the following command:
source mineru/bin/activate
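
To confirm that activation worked before installing anything, you can check the VIRTUAL_ENV variable, which the activate script of a Python virtual environment sets. The following is a minimal sketch; the function name in_mineru_env is hypothetical.

```shell
# Hypothetical check: succeeds only when a virtual environment is active.
in_mineru_env() {
    # The venv activate script exports VIRTUAL_ENV; it is unset otherwise.
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active environment: $VIRTUAL_ENV"
    else
        echo "no virtual environment is active" >&2
        return 1
    fi
}
```

After running source mineru/bin/activate, calling in_mineru_env should print the path of the mineru directory you just created. If it reports that no environment is active, run the activate command again in the same `Terminal` window.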

Install MinerU

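Enter the following command to install pip. You can skip this step if pip is already installed, which is normally the case inside an environment created with python3 -m venv. The command expects get-pip.py in the current directory; you can download it from https://bootstrap.pypa.io/get-pip.py. sudo is usually not needed while the mineru environment is active.
sudo python3 get-pip.py
After successful installation, you will see output similar to Successfully installed pip-25.3 (the version number may differ).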

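The following are the official MinerU steps. Each command uses -i to point the installer at the Aliyun PyPI mirror; omit that option to use the default PyPI index.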
1. Upgrade pip by running the following command:
pip install --upgrade pip -i https://mirrors.aliyun.com/pypi/simple

2. Install uv by running the following command:
pip install uv -i https://mirrors.aliyun.com/pypi/simple
After execution, you will see output similar to Successfully installed uv-0.9.21 (the version number may differ).

3. Install mineru by running the following command:
uv pip install -U "mineru[all]" -i https://mirrors.aliyun.com/pypi/simple
Installation can take a long time because many dependency packages must be downloaded.
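
Once installation finishes, it can help to confirm that the mineru command is reachable from your shell. The following is a minimal sketch that uses only the shell built-in command -v; the function name require_cmd is hypothetical.

```shell
# Hypothetical helper: report whether a command is available on PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        echo "$1 not found; is the mineru environment activated?" >&2
        return 1
    fi
}
```

Running require_cmd mineru in the `Terminal` window where the environment is active should report a path inside the mineru directory. If the command is not found, reactivate the environment and repeat the install step.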

Run MinerU

After completing the above steps, you can run MinerU locally to convert PDFs. If you cannot access Hugging Face in your region, first switch the model download source by running the following command (the setting applies only to the current `Terminal` session):
export MINERU_MODEL_SOURCE=modelscope
Then run the following command to start converting PDF files:
mineru -p origin_file.pdf -o ./output
Here, origin_file.pdf is the path to the PDF file you want to convert, and ./output is the path of the output directory.

If you are not familiar with `Terminal`:
  • You can first type mineru -p followed by a space, and then drag the PDF file you want to convert into the `Terminal` window. This inserts the file path for you.
  • The `./output` output directory can be customized. For example, you can specify it as ~/Desktop/mineru_output, which will create a mineru_output directory on your desktop.
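
To convert several PDFs in one go, the single-file command above can be wrapped in a loop. The following is a minimal sketch, assuming mineru is on your PATH and accepts the -p and -o options exactly as shown above. The function name convert_all is hypothetical.

```shell
# Hypothetical helper: run mineru once for every PDF in a directory.
# Usage: convert_all <pdf_dir> <output_dir>
convert_all() {
    pdf_dir=$1
    out_dir=$2
    for pdf in "$pdf_dir"/*.pdf; do
        # In sh and bash an unmatched pattern stays literal, so skip it.
        [ -e "$pdf" ] || continue
        echo "converting: $pdf"
        # Same invocation as the single-file command; keep going on failure.
        mineru -p "$pdf" -o "$out_dir" || echo "failed: $pdf" >&2
    done
}
```

For example, convert_all ~/Desktop/pdfs ~/Desktop/mineru_output would process every .pdf file in a pdfs folder on your desktop, where both directory names are placeholders. Quoting "$pdf" keeps file names that contain spaces intact.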