So one of my PDFs has a page number and a link at the bottom of every page. It’s around 500 pages, so I don’t want to edit it manually. Is there any way I can delete those from all pages of the PDF at once?

Maybe Ghostscript or a Python script can do this?

I also noticed there isn’t a PDF community on Lemmy; maybe somebody should create one.

Thanks a lot in advance.

  • thevoidzero@lemmy.world
    2 days ago

    I don’t know how comfortable you are writing your own script, but a PDF stores its components with coordinates, bounding boxes, etc., so you should be able to automate this with a small script that reads the PDF components directly.
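    A rough sketch of that idea, assuming the PyMuPDF library (imported as `fitz`) is installed. The function names and the 40-point strip height are made up for illustration; measure where the footer actually sits in your file. It deletes link annotations that start inside a strip along the bottom of each page, then redacts the strip itself, so anything else sitting in that strip is removed too.

```python
def footer_band(page_width, page_height, band_height=40.0):
    """Return (x0, y0, x1, y1) for a strip along the bottom of the page.

    PyMuPDF measures y downward from the top edge, so the footer strip
    runs from (page_height - band_height) to page_height.
    band_height=40.0 is an assumed value, not taken from any real file.
    """
    return (0.0, page_height - band_height, page_width, page_height)


def strip_footers(src_path, dst_path, band_height=40.0):
    """Remove links and text found in the bottom strip of every page."""
    import fitz  # PyMuPDF; imported here so footer_band works without it

    doc = fitz.open(src_path)
    for page in doc:
        band = fitz.Rect(*footer_band(page.rect.width, page.rect.height, band_height))
        # Drop link annotations whose clickable area starts inside the strip.
        for link in page.get_links():
            if link["from"].y0 >= band.y0:
                page.delete_link(link)
        # Mark the strip for redaction, then remove the content under it.
        page.add_redact_annot(band)
        page.apply_redactions()
    # Always write to a new file so the original stays untouched.
    doc.save(dst_path)
    doc.close()


# Example call (hypothetical file names):
# strip_footers("book.pdf", "book_clean.pdf", band_height=40.0)
```

    Try it on a copy and flip through a handful of pages first; if body text ever reaches into the strip, it will be removed along with the footer.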

    Also try using qpdf to convert the PDF into QDF format; then you can open it in a text editor and find the elements you want to remove. Look at a few pages as examples, find the pattern, and do a regex replace. Make sure to keep a copy and check the diff before accepting the result.
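    To make the regex step concrete: `qpdf --qdf --object-streams=disable in.pdf out.qdf` writes the editable form, and `fix-qdf edited.qdf > fixed.pdf` repairs stream lengths and offsets after editing. The sketch below is only one guess at a pattern. It assumes the footer is drawn as its own `BT ... ET` text object positioned with a `Tm` operator, and that a y value of 50 or less means "footer"; your file may use `Td` or a transformed coordinate system instead, so check a few pages in the editor first. The function name is made up, and link annotations (objects with `/Subtype /Link`) would need a separate pattern.

```python
import re

NUM = rb"-?\d*\.?\d+"
# One text object: BT ... ET, non-greedy, spanning lines.
TEXT_BLOCK = re.compile(rb"\bBT\b.*?\bET\b", re.DOTALL)
# A text matrix "a b c d x y Tm"; the capture group is the y coordinate.
TEXT_MATRIX = re.compile(rb"(?:" + NUM + rb"\s+){5}(" + NUM + rb")\s+Tm\b")


def strip_low_text_blocks(qdf: bytes, max_y: float = 50.0) -> bytes:
    """Delete BT...ET blocks whose Tm places text at or below max_y.

    max_y=50.0 is an assumed cutoff for where the footer sits.
    """
    def replace(match):
        block = match.group(0)
        tm = TEXT_MATRIX.search(block)
        if tm and float(tm.group(1).decode("ascii")) <= max_y:
            return b""  # footer-like block: drop it
        return block    # anything else is kept unchanged

    return TEXT_BLOCK.sub(replace, qdf)


# Example use (hypothetical file names); read and write as bytes,
# because QDF files can still contain binary data:
# data = open("out.qdf", "rb").read()
# open("edited.qdf", "wb").write(strip_low_text_blocks(data))
```

    Diff `out.qdf` against `edited.qdf` before running `fix-qdf`, so you can confirm that only the footer blocks disappeared.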