Splitting PDF files

So, I work at a school, and the people here want to keep electronic versions of the student reports. These are produced by either Access or Crystal Reports, which create one massive PDF by printing all the reports at once. The issue is how to split this PDF automatically, since each report can vary in the number of pages it has. I want to be able to have it cut on pages containing a certain word or phrase. Is that possible?

We’ve been using a program called PDFsam to do basic per-page splitting, and also have access to Acrobat Professional 8 and LiveCycle 8.

Not without custom programming.

Can they not generate the reports individually?

I’m not that familiar with databases/CR, but I’ve not seen any way to make them spit out different records as different documents so far.

I have a similar thing, where I have to split a large PDF (also the result of a Crystal Reports report, oddly enough) into numerous two-page documents, each named something like customer_id.pdf. I use pdftk to burst the original PDF into single pages and then reassemble them, and pdftotext to pull out the text I need to name them correctly. I’m lucky that my output documents are always two pages long, but if you have a common element on each page of each student’s report that you can look for, you should be able to come up with some logic to reassemble documents with a variable number of pages.
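
For what it's worth, the "common element on each page" logic could look roughly like the Python sketch below. It assumes you already have the text of each page as a list of strings (for example from pdftotext), and that every page prints a line like "Student ID: 12345". That pattern and the function names are made up for illustration, so swap in whatever your reports actually print.

```python
import re
from itertools import groupby

# Hypothetical pattern: assumes every page carries a line like "Student ID: 12345".
STUDENT_ID = re.compile(r"Student ID:\s*(\d+)")


def page_key(text):
    """Return the student ID found on a page, or None if the page has none."""
    match = STUDENT_ID.search(text)
    return match.group(1) if match else None


def group_pages(page_texts):
    """Group consecutive pages that share a key.

    Returns a list of (key, first_page, last_page) tuples, with 1-indexed,
    inclusive page numbers.
    """
    numbered = enumerate(page_texts, start=1)
    groups = []
    for key, run in groupby(numbered, key=lambda item: page_key(item[1])):
        page_numbers = [number for number, _ in run]
        groups.append((key, page_numbers[0], page_numbers[-1]))
    return groups


pages = [
    "Student ID: 1001 Maths",
    "Student ID: 1001 English",
    "Student ID: 1002 Maths",
    "Student ID: 1003 Maths",
    "Student ID: 1003 English",
    "Student ID: 1003 Art",
]
print(group_pages(pages))
# prints [('1001', 1, 2), ('1002', 3, 3), ('1003', 4, 6)]
```

Each tuple gives you a file name (the key) and the page range to reassemble, so the number of pages per student no longer matters. Pages where the pattern is not found come back with a key of None, which is worth checking for before writing any files.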

Yeah, the mid-semester reports are a fixed number of pages, so they are fine. But despite threats and cursing, teachers want to keep extending their sections of the reports for end-semester, and so with different subjects, you will get different page lengths. Personally, I think the best solution is to force the report to always be the same number of pages, but I’ve been asked to look into this.

I looked into this for a buddy of mine and could not find anything that would split other than manually or by page number. Even then, it wasn’t reliable.

Probably a bit of a stretch, but I seem to recall that you can tell Access to generate a report and write back to a field in the query that populates it. The query could pull the top record from the overall list, and exclude records that have already been marked as “printed” by the report. Then a macro could run the report and print to file, repeating until the query is empty?

Not sure how best to deal with naming and saving each file, though.

Also, it is entirely possible I am misremembering what Access is capable of.


If you’ve got a string you can split on, you should be able to cobble together a script to do this.
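
Something along these lines might do as a starting point. It is only a sketch: it assumes pdftotext and pdftk are installed and on the PATH, that the phrase you split on appears on the first page of every report, and that text extraction keeps the phrase intact. The function names and the output file pattern are made up.

```python
import subprocess


def split_ranges(page_texts, marker):
    """Return 1-indexed, inclusive (first, last) page ranges.

    A new range starts on every page whose text contains the marker.
    """
    if not page_texts:
        return []
    starts = [n for n, text in enumerate(page_texts, start=1) if marker in text]
    if not starts or starts[0] != 1:
        # Keep any pages before the first marker as their own chunk.
        starts.insert(0, 1)
    ends = [start - 1 for start in starts[1:]] + [len(page_texts)]
    return list(zip(starts, ends))


def read_pages(pdf_path):
    """Extract text with pdftotext, which ends each page with a form feed."""
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        capture_output=True, encoding="utf-8", errors="replace", check=True,
    )
    pages = result.stdout.split("\f")
    if pages and pages[-1].strip() == "":
        pages.pop()  # drop the empty piece after the final form feed
    return pages


def split_pdf(pdf_path, marker, out_pattern="report_{:03d}.pdf"):
    """Write one PDF per range using pdftk's cat operation."""
    pages = read_pages(pdf_path)
    for index, (first, last) in enumerate(split_ranges(pages, marker), start=1):
        page_range = str(first) if first == last else f"{first}-{last}"
        subprocess.run(
            ["pdftk", pdf_path, "cat", page_range, "output", out_pattern.format(index)],
            check=True,
        )
```

You would call it as split_pdf("all_reports.pdf", "End of Semester Report"), with your own file name and phrase. It is worth running split_ranges on the extracted text first and eyeballing the page ranges before letting it write any files.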