Processing our files
December 23, 2016
Archives contains a wealth of documents and media files provided by local municipalities and public resources. Our goal is to demonstrate the benefits of thoughtfully processing and sharing government records.
A project like this takes a lot of work and planning. If you want to begin a similar effort of your own, please visit Where Do You Begin?
Here’s how we built Archives:
Stored in the cloud
Archives uses a combination of cloud-based file storage services, Dropbox and Google Drive, to store our digital content. Each service is password protected, easy to use, and considered an industry standard. We use WordPress to design our site and specialized WordPress Dropbox / Google Drive plug-ins to provide our files here.
Our text-based documents (e.g., agendas, minutes) are renamed using our naming scheme, converted to PDF format, and converted to editable/searchable text using OCR.
Audio recordings are converted to MP3 format and videos are converted to MP4. (The related M4A and M4P extensions identify MPEG-4 audio files rather than video.) Our goal is to use popular file formats that will permit you to access our content using any desktop, laptop, or smart device.
Optical Character Recognition (OCR)
A scanned document is typically stored as a PDF file. This is a good start, but more is required. If your scanning software is not configured properly, the resulting PDF will contain only an IMAGE of your document. This means that the scanned document is actually a picture, not text that can be selected or searched.
When scanning, you need to make sure that your capture software is configured for Optical Character Recognition (OCR). This will convert your document into text that can be copied, edited, and searched.
If your documents were not processed using OCR, you will need to purchase a more sophisticated program, such as Adobe Acrobat, to process your document. The benefits make it worthwhile.
Naming our files and folders
The file and folder naming scheme used throughout Archives was developed by Fred Litt, Family Technology Associates, after reviewing thousands of hard-copy and digital municipal records. The goal of the naming scheme is to make each file independently identifiable.
We begin each file/folder with the name of the municipality (e.g., Allendale, HoHoKus). For dates, we use the format yyyy-mm-dd, which follows the ISO 8601 standard and sorts chronologically when files are listed by name.
Here are some of the abbreviations used in our descriptions:
- MC – Mayor and Council
- PB – Planning Board
- BA – Board of Adjustment
- BH – Board of Health
- LB – Library Board
- HP – Historic Preservation
- AG – Agenda
- MN – Minutes
- AU – Audio
- VD – Video
- WQ – Water Quality
- CS – Closed Session
- SP – Special Meeting
- RS – Regular Session
- WS – Work Session
If required, each file/folder description can end with a short explanation of the file contents. This is helpful when naming files that contain important events.
We make active use of SPACES to separate naming components. These are added to aid readability and sorting.
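As a rough illustration of how these components fit together, here is a minimal Python sketch that assembles a file name from its parts. The function name build_name and its parameters are hypothetical and are not part of the Archives tooling; the sketch only assumes the order described above (municipality, date, abbreviations, optional explanation) with single spaces between components.

```python
from datetime import date


def build_name(municipality, event_date, codes, note="", extension="pdf"):
    """Assemble a file name: municipality, yyyy-mm-dd date, codes, optional note.

    Hypothetical helper for illustration only; components are joined by spaces.
    """
    parts = [municipality, event_date.isoformat(), *codes]
    if note:
        # The optional short explanation goes at the end of the description.
        parts.append(note)
    return " ".join(parts) + "." + extension


# Example using the components of a file shown on this page.
print(build_name("Allendale", date(2016, 10, 13), ["MC", "MN", "WS"]))
# Allendale 2016-10-13 MC MN WS.pdf
```

Because the date is written as yyyy-mm-dd, an ordinary alphabetical sort of names built this way also places the files for one municipality in date order.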
How our processed files are organized
The PDF and media files are collected into MAJOR FOLDERS (e.g., Allendale Mayor and Council).
Let’s select the Allendale Mayor and Council folder. Its contents are separated into sub-categories.
Let’s select Allendale MC Minutes (MN). The files are separated into years.
Finally, we select the Allendale 2016 MC MN folder. You will find easily accessible and identifiable digital files. Here are some of the files for 2016.
The file displayed below is named Allendale 2016-10-13 MC MN WS.pdf
Our naming scheme identifies this file as:
- Allendale – from the Borough of Allendale
- 2016-10-13 – the event discussed in the document is dated October 13, 2016
- MC – Mayor and Council
- MN – Minutes
- WS – Work Session
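The same breakdown can be done automatically. The following Python sketch is one possible way to split a file name into its components; it is an assumption about how such a parser might look, not the method used to build Archives. The names ABBREVIATIONS, NAME_PATTERN, and parse_name are hypothetical, and the abbreviation table is copied from the list earlier on this page.

```python
import re
from datetime import date

# Abbreviations taken from the list on this page.
ABBREVIATIONS = {
    "MC": "Mayor and Council", "PB": "Planning Board",
    "BA": "Board of Adjustment", "BH": "Board of Health",
    "LB": "Library Board", "HP": "Historic Preservation",
    "AG": "Agenda", "MN": "Minutes", "AU": "Audio", "VD": "Video",
    "WQ": "Water Quality", "CS": "Closed Session",
    "SP": "Special Meeting", "RS": "Regular Session", "WS": "Work Session",
}

# Assumed layout: "<municipality> <yyyy-mm-dd> <codes and optional note>.<ext>"
NAME_PATTERN = re.compile(
    r"^(?P<municipality>.+?) (?P<date>\d{4}-\d{2}-\d{2}) (?P<rest>.+)\.(?P<ext>\w+)$"
)


def parse_name(file_name):
    """Split a file name into municipality, date, codes, note, and extension."""
    match = NAME_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"Name does not follow the expected layout: {file_name!r}")
    tokens = match["rest"].split(" ")
    codes = []
    # Leading tokens that are known abbreviations are codes; the remainder
    # is treated as the optional short explanation.
    while tokens and tokens[0] in ABBREVIATIONS:
        codes.append(tokens.pop(0))
    return {
        "municipality": match["municipality"],
        "date": date.fromisoformat(match["date"]),
        "codes": codes,
        "meanings": [ABBREVIATIONS[code] for code in codes],
        "note": " ".join(tokens),
        "extension": match["ext"],
    }


info = parse_name("Allendale 2016-10-13 MC MN WS.pdf")
print(info["meanings"])  # ['Mayor and Council', 'Minutes', 'Work Session']
```

A name that does not contain a yyyy-mm-dd date, such as a folder name, is rejected with a ValueError in this sketch rather than guessed at.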
For the documents that contain text or images, we have applied the following processing. Each file is:
- Named using our file naming scheme
- Converted into PDF (text) or JPG (photo) format for universal access
- Processed using Optical Character Recognition (OCR) to permit text copying and searching
For audio and video files, we have applied the following processing. Each file is:
- Named using our file naming scheme
- Stored in an audio format (e.g., MP3, M4A) or video format (e.g., MP4) that permits easy access via any PC or smart device.
Printing and Downloading
- Once displayed, each PDF can be printed using the print command of the application displaying it.
- Each file can be downloaded to your PC.
- Once the file appears, RIGHT-CLICK to display the document options. Depending on the PC you are using, click on an option such as SAVE AS or SAVE TARGET AS.
- You will then be prompted for a location to store the file.
Once files are downloaded, word or phrase searching can be performed on both the file description and the file content. File content searching is available when using a personal computer (Apple or Windows) with a PDF reader such as Adobe Reader.
Most smart devices have apps that will permit searching on file descriptions.
Once a PDF file has been downloaded onto your PC (Windows or Apple), you can search file content – this will require Adobe Reader or another PDF viewing application.
- Open the PDF file
- Press CTRL+F (Command+F on an Apple computer) and a search box will appear, typically in the upper right
- Enter the word or phrase to be located
- If you store multiple files in a folder on your PC, your PC (Apple or Windows) should have the ability to perform a search across all of the documents in the folder.
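Searching on file descriptions across a folder of downloaded files can also be scripted. The short Python sketch below is only an illustration: the function name find_by_description is hypothetical, and it matches on file names only. Searching inside PDF content would require a separate PDF library and is not shown here.

```python
from pathlib import Path


def find_by_description(folder, phrase):
    """Return the names of files in `folder` whose name contains `phrase`.

    Matching ignores upper/lower case. Only the file description (name)
    is searched, not the text inside each file.
    """
    needle = phrase.casefold()
    return sorted(
        path.name
        for path in Path(folder).iterdir()
        if path.is_file() and needle in path.name.casefold()
    )
```

For example, calling find_by_description on a folder of downloads with the phrase "MC MN" would list every Mayor and Council minutes file stored there, in date order for a given municipality.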