Inside PDF
Author’s note: This is a revised version of my previous blog to make it more international.
Author’s note: Link to a Romanian translation of this article (by Alexander Ovsov)
Author’s note: Link to a Ukrainian translation of this article (by Yaroslava Levchenko)
Data versus Information
As part of open government initiatives, government agencies are creating guidelines and goals for making more of the information held by them more accessible to the public. See for example, the Open data initiative in the UK and the EU directive on Public Sector Information. Of course, we advocate the judicious use of PDF for disseminating that government information.
Well, in most situations. We use the terms data and information to distinguish at least two ways government agencies are being asked to provide information to the public in electronic form. Data is just the raw numbers, names, places, etc. The stuff one might pull out of a database. And then there is the most extreme view that the medium is the message!
When we go to government websites we usually want information, not data. An analyst who goes to a government website, however, might want just the raw data so that it can be interpreted, shaped and analyzed, and turned into a specific document: information that the government website does not itself provide.
Hammers
There is the saying that if you have a hammer, then everything looks like a nail. And we have to confess that, from our view, almost all needs are best addressed using PDF. PDF is our metaphorical hammer. We have some justification for this position as the most significant PDF software provider in the world.
Government agencies can, and do, use PDF for effective information distribution. Information distributed as a PDF can be downloaded, read in its electronic form, saved for later reference, shared and printed. Everyone has a PDF reader. PDF documents can also be infused with what we at Adobe like to call “rich document” features. The final representation of the information can be very important. As noted earlier, the medium is the message, or at least it can make an important contribution. However, for the person who wants raw data, PDF isn’t the right choice.
See, we are willing to refrain from hitting everything in sight with our PDF hammer! Some XML enthusiasts, but certainly not all, go overboard. We think it hurts their cause.
As I have blogged earlier (XML for . We need to drop our hammers and consider the facts. Using XML for raw data is the kernel of a good idea. But there are some limitations of XML that need to be addressed when considering it as the only option for data distribution: 1. Those who are not familiar with XML need to realize that XML isn’t a single markup language for a single use; it is a method for defining and using specialized markup languages. That is why we have to say XML for business cards, XML for invoices, XML for classifying political action committees and so on. There are thousands of such XML markup languages, and there will be thousands more to cover all those government datasets where XML is appropriate.
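To make the point about specialized markup languages concrete, here is a minimal sketch in Python. The two vocabularies (`card` and `invoice`) and all of their element and attribute names are invented for illustration; they are not taken from any real government dataset or standard. The sketch only shows that each vocabulary needs its own reader, even though both are well-formed XML.

```python
import xml.etree.ElementTree as ET

# Two hypothetical XML vocabularies, made up for this illustration.
BUSINESS_CARD = """<card>
  <name>Pat Example</name>
  <phone>555-0100</phone>
</card>"""

INVOICE = """<invoice number="17">
  <item price="2.50" quantity="4">Widget</item>
  <item price="10.00" quantity="1">Gadget</item>
</invoice>"""


def read_card(xml_text):
    """Reader that understands only the hypothetical 'card' vocabulary."""
    root = ET.fromstring(xml_text)
    return {"name": root.findtext("name"), "phone": root.findtext("phone")}


def invoice_total(xml_text):
    """Reader that understands only the hypothetical 'invoice' vocabulary."""
    root = ET.fromstring(xml_text)
    return sum(
        float(item.get("price")) * int(item.get("quantity"))
        for item in root.iter("item")
    )


card = read_card(BUSINESS_CARD)
total = invoice_total(INVOICE)

# Both documents parse as XML, but the card reader finds nothing it
# recognizes inside an invoice: the markup language differs.
mismatch = read_card(INVOICE)

print(card)
print(total)
print(mismatch)
```

The generic parser accepts both files, but knowing what the elements mean is a separate problem for every vocabulary.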
I also have a blog on this topic. Large raw datasets can be prohibitively large when expressed in an XML markup language. Unnecessarily large, from an information-theoretic view.
For example, here is an XML data file that can be found on www. Note that, when you download this file, it is a ZIP file whose size is 1. In this case, the EPA personnel know that XML files can be very large and have packaged it in a ZIP for downloading to reduce the transmission time by a multiple of over 1. In other words, if it takes a minute to download the ZIP’ed version, it will take over 1. XML version. After unzipping, the XML file is identical to the original. So any advocacy for XML should always be accompanied by a discussion of file size and consideration of using something like ZIP.
To do otherwise would be irresponsible. XML files need additional metadata in order to make use of the data that is found within them. If you are given three numbers (0. Unless given additional information, there is an ambiguity between the 7 and the 1.
And this is a trivial example. We need extra information, e. The basic syntactic rules used for a markup language can and should be provided by offering an XML Schema (. We also need to explain the semantics; that usually requires a technical document. There are other raw data formats that might be more suited to particular needs such as standard spreadsheet files (.
Microsoft and Open Office spreadsheet formats (. In addition there may be very application-specific files, not in XML format, appropriate for specific needs. For example, shapefiles (. Note: There can be an argument made that compressing/decompressing files is so time consuming that the time lost there is not made up in reduced transmission time. With today’s lightning-fast CPUs the compression and decompression times are relatively minor, but the transmission times can be a problem if you don’t have the latest and greatest Internet connections.
So nearly always, the choice to use compressed data is the right one for data expressed in an XML markup language.
PDF versus ZIP
In another Inside PDF blog (PDF File Attachments), the file attachment features of standard PDF (ISO 3. To summarize, any number of file attachments, in any format, can be embedded into any PDF file. They can be extracted for use by anyone receiving the PDF file.
In addition, when files are attached/embedded into the PDF, they will be compressed using the same compression method that is used in ZIP files: deflate/flate. For the purposes of distributing government data, this is nearly ideal.
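The effect of deflate on XML can be sketched with Python’s standard `zlib` module, which implements the deflate algorithm. The sample document below is generated on the spot and its element names are made up, so the ratio it prints says nothing about any particular government dataset; the sketch only illustrates that repetitive markup compresses well and that decompression restores the bytes exactly.

```python
import zlib


def make_sample_xml(rows):
    """Build a repetitive XML document; the element names are hypothetical."""
    lines = ["<readings>"]
    for i in range(rows):
        lines.append(f'  <reading station="S{i % 10}" value="{i * 0.5:.1f}"/>')
    lines.append("</readings>")
    return "\n".join(lines).encode("utf-8")


original = make_sample_xml(5000)

# Compress with deflate at the highest compression level.
compressed = zlib.compress(original, 9)

# Decompression is lossless: the restored bytes equal the original.
restored = zlib.decompress(compressed)

ratio = len(original) / len(compressed)
print(f"original:   {len(original)} bytes")
print(f"compressed: {len(compressed)} bytes")
print(f"ratio:      {ratio:.1f}x")
print("identical after decompression:", restored == original)
```

Real datasets are less regular than this generated sample, so their ratios will differ, but tag-heavy XML generally leaves a lot of redundancy for deflate to remove.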
The PDF file can carry the . PDF document, itself, can provide all the additional semantic information that would be needed in order to make use of the data: the metadata. If the raw data is in XML form, then a compressed XML Schema file (. PDF document. So when using PDF, the points made above are addressed: file size, necessary metadata to define the XML markup language used, and formats other than XML.
Sample PDF envelope containing XML data
We have created a sample PDF envelope starting from this government dataset. Note that both the XML dataset and the associated Schema file, which helps to define the XML markup language used for this file, are attachments to the PDF.
We took the general introduction from the government web page and made up a brief description for each of the XML elements found in the file. Make sure to use a PDF reader that can display the attachment annotations and that can extract the attachments. Adobe Reader can do that.
Programmatic extraction of any data out of PDF
PDF attachments can be of any format and can also be organized hierarchically, just as they can in a ZIP file. And like ZIP, there are numerous open source projects devoted to the creation, modification and viewing of PDF. One very popular one is iText by Bruno Lowagie and iText Software. They also have an excellent example demonstrating how to use iText to create a PDF containing various data files. It also shows how someone could programmatically extract those contents.
Government agencies can send digitally certified PDF files containing data files and their customers can authenticate that the PDF, and all the attachments, came from that agency and have not been tampered with. See the Inside PDF blog about Authenticated PDF Documents.
Other ways to use PDF attachments for government information delivery
So we described how we use PDF to provide a complete package for raw data downloading. An annotation can be placed on the chart or table that allows the appropriate attachment to be extracted. In a sense, this makes PDF editable, something that people have asked for. Quoting from the OpenOffice.org website: “A hybrid PDF/ODF file is a PDF file that contains an embedded ODF source file. Hybrid PDF/ODF files will be opened in OpenOffice.org as an ODF file without any layout changes. Users without this extension can open the PDF part of the hybrid file with their PDF viewer.” Adobe Acrobat’s Microsoft Office tools can also create PDF files with the Office file that created them as an attachment.
So hammer away, you PDF enthusiasts.