Publitz: An Approach to Publishing

by
Jason Kantz
Sep 2018

Publitz is a document format and publishing system that addresses some of the annoyances I have with writing and reading documents that are published and served on the web. The parts of publitz that I want to build out include:

  • plaintext "markaround" language
  • HTML format designed for "published" documents
  • APIs for interacting with documents, the writers, and other readers

Some design goals for publitz:

1. Documents are self-contained.

Once you have the publitz HTML file, no other network requests are required to view the document. This means there's only a single download of the HTML file, and it works offline. The contents of the document can also be identified by a single SHA-1 hash and verified with a SHA-2 hash. For example:

$ publitz verify mydoc.htm
c431cd262a5101db09928df23d6ff3f55ca7927a mydoc.htm OK
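A minimal sketch of what such a check could look like, in Python. It assumes the hash covers the raw bytes of the whole file, which is not specified here (the real tool might hash only an inner portion of the document). The names `sha1_hex` and `verify_file` are hypothetical and are not part of any existing publitz tool.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of data as 40 lowercase hex characters."""
    return hashlib.sha1(data).hexdigest()


def verify_file(path: str, expected_hex: str) -> bool:
    """Hypothetical check: hash the whole file and compare it to the expected digest."""
    with open(path, "rb") as f:
        return sha1_hex(f.read()) == expected_hex.lower()
```

Because the file is self-contained, a single digest is enough to cover everything a reader sees; no linked stylesheet or script can change without the hash changing.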

2. Pages are numbered.

The publishing step breaks the text into pages. An anchor link for the page can be added when viewing the HTML file. This allows specific pages to be cited within a long document.

For example:

https://kantz.com/htm/watz.htm#p3
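As a small illustration, a page citation of this shape can be split back into the document URL and the page number. This is only a sketch: it assumes page anchors always look like `#p` followed by digits, as in the example above, and `split_page_citation` is a hypothetical helper name.

```python
from typing import Optional, Tuple
from urllib.parse import urldefrag


def split_page_citation(url: str) -> Tuple[str, Optional[int]]:
    """Split a citation URL into (document URL, page number).

    Assumes page anchors look like '#p3'; the page is None when the
    URL carries no such anchor.
    """
    base, fragment = urldefrag(url)
    if fragment.startswith("p") and fragment[1:].isdecimal():
        return base, int(fragment[1:])
    return base, None
```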

3. On-screen pages match pages in the printed document.

This means a citation remains valid regardless of the way the document is presented: online, physical printed document, print to PDF, etc.

4. Publitz documents have a particular look and structure.

Similar to how APA is a format for the social sciences or MLA is a format for the humanities, I'd be delighted if publitz settled into being a format for generalists who want the stuff they write to be read and cited by other generalists. Publitz should work well for publishing both fiction and non-fiction. Publitz is not for people who want to make money by putting dynamic ads in their documents.

Publitz is all about the writing, and with the consistency of the format, the particular details of it should fade into the background.

The implications are: a) The preferred publishing format is publitz_htm in order to be most widely accessible. b) The documents are confined to a specific CSS that uses a widely available font for consistent appearance and text wrapping. This implies a font that is open, and metric-compatible with Times New Roman, like Liberation Serif. c) Publitz needs conventions for citing sources, for formatting "front matter", and for being "readable" on the screen.

5. The online reading experience approximates a book.

a) A visual page indicator at the top of the screen gives a visual estimate of how big the document is, similar to how the thickness of a book indicates its length. b) The page indicator allows clicking between pages and also makes it easy to binary search through a document. c) The "print mode" displays the entire document on the screen at once for printing. Print mode also makes it easier to search for text across the whole document using the browser's built-in search tool.

6. Only URLs are hyperlinked.

This is possibly only a personal annoyance, but a lot of authors sprinkle links throughout a document and don't include a "References" section. This assumes the links will work forever, and sidesteps the responsibility of collecting, thinking about, and giving respect to the sources being used. I find inline links useful for reference material where I'm seeking a specific bit of information and don't care to read the whole document.

So my current thinking is that publitz isn't for reference material, it's for documents that are intended to be read in their entirety, and therefore only full URIs are hyperlinked in the published version.
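One way a publishing step might apply this rule is sketched below in Python. It is an illustration under stated assumptions, not the actual publitz behavior: a "full URI" is taken to mean anything starting with `http://` or `https://` up to the next whitespace, trailing sentence punctuation is excluded from the link, and `link_full_uris` is a hypothetical name.

```python
import html
import re

# Assumption: a "full URI" starts with http:// or https:// and runs
# up to the next whitespace character.
URI_PATTERN = re.compile(r"https?://\S+")


def link_full_uris(text: str) -> str:
    """Escape text for HTML and wrap only the full URIs in <a> tags."""
    parts = []
    last = 0
    for match in URI_PATTERN.finditer(text):
        # Trailing sentence punctuation is treated as not part of the URI.
        uri = match.group().rstrip(".,;:!?)")
        end = match.start() + len(uri)
        parts.append(html.escape(text[last:match.start()]))
        escaped = html.escape(uri, quote=True)
        parts.append(f'<a href="{escaped}">{escaped}</a>')
        last = end
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```

The link text is always the URI itself, so the address stays visible when the document is printed.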

7. Plaintext format is "markaround" instead of markdown (Gruber, 2004).

The plaintext has single-line commands that change the mode of the document. For example, a centered paragraph followed by two normal paragraphs looks like:

 #:p.center
 This is centered

 #:p
 This is not centered.

 And this is a third paragraph that is also not centered
 because the mode is stateful.
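The stateful behavior can be sketched with a small Python parser. This is a guess at the semantics based only on the example above: a command is assumed to be a whole line starting with `#:` in the first column, blank lines are assumed to separate paragraphs, and the default mode is assumed to be `p`. The function name `parse_markaround` is hypothetical.

```python
from typing import List, Tuple


def parse_markaround(source: str) -> List[Tuple[str, str]]:
    """Return (mode, paragraph text) pairs from markaround-style plaintext.

    Assumptions: a command is a whole line starting with '#:' in column 0,
    blank lines separate paragraphs, and the default mode is 'p'.
    """
    mode = "p"
    paragraphs: List[Tuple[str, str]] = []
    current: List[str] = []

    def flush() -> None:
        # Close the paragraph being collected, tagging it with the current mode.
        if current:
            paragraphs.append((mode, " ".join(current)))
            current.clear()

    for line in source.splitlines():
        if line.startswith("#:"):
            flush()
            mode = line[2:].strip()  # the mode persists until the next command
        elif not line.strip():
            flush()
        else:
            current.append(line.strip())
    flush()
    return paragraphs
```

Run on the example above, the third paragraph comes out with mode `p` even though no command precedes it, because the mode set by `#:p` is still in effect.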

The document easily converts to plaintext with

 grep -v -e "^#"
 

8. Distribution and reader/writer interaction

This is the area that's still under consideration. Here are some initial ideas.

  • Authors publish their document by uploading it to a service that keys the document by the inner document hash.
  • Since the file is self-contained, authors might digitally sign the hash.
  • The service allows for revisions: one hash/document might replace another.
  • How to lead to latest document revision/edition?
  • References section: cite publitz documents with a URI like ptz://c431cd2...
  • Structured references make it easy to build out "most cited" metrics.
  • Maybe implement this as a dApp and uploads go to swarm.
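To make a few of these ideas concrete, here is a rough in-memory sketch in Python of a store keyed by document hash, with one possible answer to the "latest revision" question: each replaced hash points to its successor, and a lookup follows the chain. Everything here is an assumption for illustration: the class and method names are hypothetical, the hash is taken over the whole file rather than an "inner document", and a real service would need persistence, signatures, and access control.

```python
import hashlib
from typing import Dict, Optional


class DocumentStore:
    """Hypothetical in-memory stand-in for the publishing service."""

    def __init__(self) -> None:
        self._docs: Dict[str, bytes] = {}
        self._replaced_by: Dict[str, str] = {}

    def publish(self, content: bytes, replaces: Optional[str] = None) -> str:
        """Store content under its SHA-1 hash; optionally mark an older hash as replaced."""
        key = hashlib.sha1(content).hexdigest()
        self._docs[key] = content
        if replaces is not None and replaces != key:
            self._replaced_by[replaces] = key
        return key

    def latest(self, key: str) -> str:
        """Follow replacement links from key to the newest known revision."""
        seen = {key}
        while key in self._replaced_by:
            key = self._replaced_by[key]
            if key in seen:  # guard against a cycle of replacements
                break
            seen.add(key)
        return key

    def get(self, key: str) -> bytes:
        """Return the stored bytes for an exact hash."""
        return self._docs[key]


def ptz_uri(key: str) -> str:
    """Build a citation URI in the ptz:// style mentioned above."""
    return "ptz://" + key
```

With this shape, a citation to an old hash still resolves to exactly the text that was cited, while `latest` offers a way to discover that a newer edition exists.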

If you've read this far and have suggestions or want to collaborate, please reach out via email: mailto:jason@kantz.com

References

American Psychological Association. (2009). Publication manual of the APA, 6 ed.

Eastlake, D., Hansen, T. (2011). US secure hash algorithms. Internet Engineering Task Force Request for Comments 6234. AT&T Labs: May 2011. Retrieved from https://ietf.org/rfc/rfc6234.txt

Faulkner, S., Eicholz, A., Leithead, T., Danilo, A., Moon, S. (2017). HTML 5.2 W3C Recommendation. Retrieved from https://www.w3.org/TR/2017/REC-html52-20171214/

Gruber, J. (2004). Markdown [computer software]. Philadelphia, PA. Available from https://daringfireball.net/projects/markdown/

The Modern Language Association of America. (2016). MLA handbook, 8 ed.

Trón, V., Elad, Destinatis, Aron. (2018). Welcome to the Swarm Documentation! Retrieved from https://swarm-guide.readthedocs.io/en/latest/index.html

Webbink, M. (2007). Liberation Fonts. Red Hat Blog. Retrieved from https://www.redhat.com/en/blog/liberation-fonts