Custom org-sitemap-function post-Org 9.1
Published: December 27, 2022
I have been lugging around an old version of org-mode (9.0 to be specific) in the git repo which builds this website for a number of years now. I decided to do this because I had a custom org-sitemap-function to generate the landing page for my blog, but org 9.1 introduced a breaking change to the org-publish API.
I have now finally come around to fixing this issue and making my website compatible with modern emacs and org-mode versions higher than 9.1. However, porting my old sitemap function was… surprisingly difficult? So just in case someone is looking up my original post these days, this post contains a sitemap-function which will work in 2022.
In the pre-9.1 version, the org-sitemap-function would get the project-plist as its argument. Post-9.1, the org manual states:
:sitemap-function
Plug-in function to use for generation of the sitemap. It is called with two arguments: the title of the site-map and a representation of the files and directories involved in the project as a nested list, which can further be transformed using org-list-to-generic, org-list-to-subtree and alike. Default value generates a plain list of links to all files in the project.
This makes it a little more difficult to produce the sitemap page that we want, but I was able to get it done by parsing each element of the list and extracting the path to the filename with the following function:
(defun my-blog-parse-sitemap-list (l)
  "Convert the sitemap list into a list of filenames."
  (mapcar #'(lambda (i)
              (let ((link (with-temp-buffer
                            (let ((org-inhibit-startup nil))
                              (insert (car i))
                              (org-mode)
                              (goto-char (point-min))
                              (org-element-link-parser)))))
                (when link
                  (plist-get (cadr link) :path))))
          (cdr l)))
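To make the parsing step concrete, here is a minimal sketch of what happens to a single entry. It assumes that each entry of the list is a one-element list holding an Org link string, which is the shape the default entry format appears to produce. The helper name my-example-entry-path and the two file names are made up for illustration and are not part of my setup.

```emacs-lisp
(require 'org)
(require 'org-element)

;; Hypothetical helper, for illustration only: extract the link path
;; from one sitemap entry string such as "[[file:post.org][Post]]".
(defun my-example-entry-path (entry)
  "Return the path of the Org link at the start of ENTRY, or nil."
  (with-temp-buffer
    (insert entry)
    (org-mode)
    (goto-char (point-min))
    ;; Parse the link at point; returns nil if there is no link there.
    (let ((link (org-element-link-parser)))
      (when link
        (org-element-property :path link)))))

;; Hand-written list in the (assumed) shape passed to the sitemap
;; function: a list-type symbol followed by one list per entry.
(let ((example-list '(unordered
                      ("[[file:first-post.org][First post]]")
                      ("[[file:second-post.org][Second post]]"))))
  (mapcar (lambda (item) (my-example-entry-path (car item)))
          (cdr example-list)))
```

Evaluating the last form should give back one path per entry, which is the plain list of file names the rest of the sitemap code works with.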
Finally, the new and improved sitemap function looks like this:
(defun my-blog-sort-article-list (l p)
  "Sort the article list anti-chronologically."
  (sort l #'(lambda (a b)
              (let ((d-a (org-publish-find-date a p))
                    (d-b (org-publish-find-date b p)))
                (not (time-less-p d-a d-b))))))

(defun my-blog-sitemap (title list)
  "Generate the landing page for my blog."
  (with-temp-buffer
    ;; mangle the parsed list given to us into a plain lisp list of files
    (let* ((filenames (my-blog-parse-sitemap-list list))
           (project-plist (assoc "blog-articles" org-publish-project-alist))
           (articles (my-blog-sort-article-list filenames project-plist)))
      ;; iterate over the sorted list; `sort' is destructive, so
      ;; `filenames' should not be relied upon after the call above
      (dolist (file articles)
        (let* ((abspath (file-name-concat my-website-blog-dir file))
               (relpath (file-relative-name abspath my-website-base-dir))
               (title (org-publish-find-title file project-plist))
               (date (format-time-string (car org-time-stamp-formats)
                                         (org-publish-find-date file project-plist)))
               (preview (my-blog-get-preview abspath)))
          ;; insert a horizontal line before every post, kill the first one
          ;; before saving
          (insert "-----\n")
          (insert (concat "* [[file:" relpath "][" title "]]\n"))
          ;; add properties for `ox-rss.el' here
          (let ((rss-permalink (concat (file-name-sans-extension relpath) ".html"))
                (rss-pubdate date))
            (org-set-property "RSS_PERMALINK" rss-permalink)
            (org-set-property "PUBDATE" rss-pubdate))
          ;; insert the date, preview, & read more link
          (insert (concat "Published: " date "\n\n"))
          (insert preview)
          (insert "\n")
          (insert (concat "[[file:" relpath "][Read More...]]\n"))))
      ;; kill the first hrule to make this look OK
      (goto-char (point-min))
      (let ((kill-whole-line t)) (kill-line))
      ;; insert a title and save
      (insert "#+OPTIONS: title:nil\n")
      (insert "#+TITLE: Blog - Dennis Ogbe's Personal Website\n")
      (insert "#+AUTHOR: Dennis Ogbe\n")
      (insert "#+EMAIL: [email protected]\n")
      (buffer-string))))
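For comparison, a sitemap function with the post-9.1 signature does not have to be this involved. The following is a minimal sketch, assuming you only want a title followed by a plain list of links, which as far as I can tell is roughly what the default generator produces. The function name my-example-minimal-sitemap and the sample entries are hypothetical.

```emacs-lisp
(require 'org)

;; Hypothetical name, for illustration only.
(defun my-example-minimal-sitemap (title list)
  "Return a plain sitemap as a string: TITLE followed by LIST as an Org list."
  (concat "#+TITLE: " title "\n\n"
          ;; `org-list-to-org' turns the nested list back into Org syntax.
          (org-list-to-org list)))

;; Try it on a hand-written list in the shape described by the manual.
(my-example-minimal-sitemap
 "Blog"
 '(unordered
   ("[[file:first-post.org][First post]]")
   ("[[file:second-post.org][Second post]]")))
```

The important part is the contract: the function receives the title and the nested list, and returns the contents of the sitemap file as a string.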
This info can be combined with the instructions in my original post to cook up your own very special org-mode website.
Update: The site is built by loading the project file with emacs --batch, i.e., something like:
emacs --batch -l "./project.el" --eval="(org-publish \"blog\" t)"
As part of the build process, I use CSSTidy to minify my CSS and bibtex2html to generate the list of publications.
;; This file defines the org-publish project for my web site. -*- eval: (flycheck-mode -1) -*-

;; I can either run this from the build.sh script / Makefile or
;; evaluate this buffer and publish from within Emacs while editing
;; the page.
(defun generate-website (arg)
  "Generate my website. Call with prefix argument for a complete rebuild."
  (interactive "P")
  (message "Generating website for staging...")
  (if arg
      ;; force rebuild everything
      (org-publish "blog" t nil)
    ;; only rebuild what changed
    (org-publish "blog" nil nil))
  (message "Done. Check output in %s" my-website-out-dir))

;; I am not sure why I have to do it this way... This snippet finds
;; the parent directory of this file, which is the base directory of
;; the project.
(setq my-website-base-dir
      (file-name-as-directory
       (file-name-directory
        (directory-file-name
         (file-name-directory (or load-file-name buffer-file-name))))))

;; set up the rest of the directory tree
(defmacro my-website-set-path-var (name)
  (list 'setq (intern (format "my-website-%s-dir" name))
        (list 'file-name-as-directory (concat my-website-base-dir name))))

(my-website-set-path-var "bin")
(my-website-set-path-var "bib")
(my-website-set-path-var "blog")
(my-website-set-path-var "css")
(my-website-set-path-var "cv")
(my-website-set-path-var "dl")
(my-website-set-path-var "html")
(my-website-set-path-var "img")
(my-website-set-path-var "lisp")
(my-website-set-path-var "pages")

;; we pull the output directory out of an environment variable. If this
;; variable is not set, we fall back to the "www" directory below the
;; base directory
(setq my-website-out-dir (getenv "WEBSITE_OUT_DIR"))
(unless my-website-out-dir
  (setq my-website-out-dir (file-name-concat my-website-base-dir "www"))
  (message "Using default WEBSITE_OUT_DIR: %s" my-website-out-dir))
(setq my-website-out-dir (file-name-as-directory my-website-out-dir))

;; [2022-12-27 Tue] This is now compatible with emacs 28.1. it
;; requires the `htmlize' and `org-contrib' packages.
(package-initialize)
(require 'org)
(require 'htmlize)
(require 'org-contrib)
(require 'ox-html)
(require 'ox-rss)

;; re-build the entire project if $WEBSITE_BUILD_TYPE=FULL
(when (and (getenv "WEBSITE_BUILD_TYPE")
           (string-equal (downcase (getenv "WEBSITE_BUILD_TYPE")) "full"))
  (setq org-publish-use-timestamps-flag nil))

;; html export settings
(setq org-export-html-coding-system 'utf-8-unix)
(setq org-html-htmlize-output-type 'css)

;; massage org-time-stamps
(setq org-time-stamp-custom-formats '("%B %d, %Y" . "%A, %B %d %Y, %H:%M"))
(defun my-org-export-ensure-custom-times (backend)
  (setq-local org-display-custom-times t))
(add-hook 'org-export-before-processing-hook 'my-org-export-ensure-custom-times)

;; we do not need backup files for this
(setq make-backup-files nil)

;; we evaluate some elisp to generate some html. this lets us do that.
(setq org-confirm-babel-evaluate nil)

(defun my-blog-extra-head (arg)
  (concat
   "<link rel='stylesheet' href='/../res/fonts.css' />\n" ; fonts
   "<link rel='stylesheet' href='/../res/code.css' />\n" ; code highlighting
   "<link rel='stylesheet' href='/../res/main.css' />\n" ; main css
   (when arg "<link rel='stylesheet' href='/../res/blog.css' />\n") ; blog style
   "<link rel='shortcut icon' href='/../img/favicon.ico'>\n" ; favicon
   "<link rel='alternate' type='application/rss+xml' title='RSS Feed for ogbe.net' href='/blog.xml' />\n"))

;; header and footer
(defun my-blog-header (info)
  (with-temp-buffer
    (insert-file-contents (concat my-website-html-dir "header.html"))
    (buffer-string)))

(setq my-blog-footer
      (with-temp-buffer
        (insert-file-contents (concat my-website-html-dir "footer.html"))
        (buffer-string)))

(defun my-blog-org-export-format-drawer (name content)
  (concat "<div class=\"drawer " (downcase name) "\">\n"
          "<h6>" (capitalize name) "</h6>\n"
          content
          "\n</div>"))

(setq my-blog-local-mathjax
      '((path "/mathjax/tex-chtml.js")
        (scale "100")
        (align "center")
        (indent "2em")
        (tagside "right")
        (autonumber "AMS")
        (mathml nil)))

(setq my-blog-extra-mathjax-config
      "<script> MathJax = { tex: { inlineMath: [['$', '$'], ['\\\\(', '\\\\)']] }, svg: { fontCache: 'global' } }; </script>")

(defun my-blog-get-preview (file)
  "The comments in FILE have to be on their own lines, preferably before and after paragraphs."
  (with-temp-buffer
    (insert-file-contents file)
    (goto-char (point-min))
    (let ((beg (+ 1 (re-search-forward "^#\\+BEGIN_PREVIEW$")))
          (end (progn (re-search-forward "^#\\+END_PREVIEW$")
                      (match-beginning 0))))
      (buffer-substring beg end))))

(defun my-blog-parse-sitemap-list (l)
  "Convert the sitemap list into a list of filenames."
  (mapcar #'(lambda (i)
              (let ((link (with-temp-buffer
                            (let ((org-inhibit-startup nil))
                              (insert (car i))
                              (org-mode)
                              (goto-char (point-min))
                              (org-element-link-parser)))))
                (when link
                  (plist-get (cadr link) :path))))
          (cdr l)))

(defun my-blog-sort-article-list (l p)
  "Sort the article list anti-chronologically."
  (sort l #'(lambda (a b)
              (let ((d-a (org-publish-find-date a p))
                    (d-b (org-publish-find-date b p)))
                (not (time-less-p d-a d-b))))))

(defun my-blog-sitemap (title list)
  "Generate the landing page for my blog."
  (with-temp-buffer
    ;; mangle the parsed list given to us into a plain lisp list of files
    (let* ((filenames (my-blog-parse-sitemap-list list))
           (project-plist (assoc "blog-articles" org-publish-project-alist))
           (articles (my-blog-sort-article-list filenames project-plist)))
      ;; iterate over the sorted list; `sort' is destructive, so
      ;; `filenames' should not be relied upon after the call above
      (dolist (file articles)
        (let* ((abspath (file-name-concat my-website-blog-dir file))
               (relpath (file-relative-name abspath my-website-base-dir))
               (title (org-publish-find-title file project-plist))
               (date (format-time-string (car org-time-stamp-formats)
                                         (org-publish-find-date file project-plist)))
               (preview (my-blog-get-preview abspath)))
          ;; insert a horizontal line before every post, kill the first one
          ;; before saving
          (insert "-----\n")
          (insert (concat "* [[file:" relpath "][" title "]]\n"))
          ;; add properties for `ox-rss.el' here
          (let ((rss-permalink (concat (file-name-sans-extension relpath) ".html"))
                (rss-pubdate date))
            (org-set-property "RSS_PERMALINK" rss-permalink)
            (org-set-property "PUBDATE" rss-pubdate))
          ;; insert the date, preview, & read more link
          (insert (concat "/Published: " date "/\n\n"))
          (insert preview)
          (insert "\n")
          (insert (concat "[[file:" relpath "][/Read More.../]]\n"))))
      ;; kill the first hrule to make this look OK
      (goto-char (point-min))
      (let ((kill-whole-line t)) (kill-line))
      ;; insert a title and save
      (insert "#+OPTIONS: title:nil\n")
      (insert "#+TITLE: Blog - Dennis Ogbe's Personal Website\n")
      (insert "#+AUTHOR: Dennis Ogbe\n")
      (insert "#+EMAIL: [email protected]\n\n")
      (insert "@@html:<h1>Blog</h1>@@\n\n") ; this way the browser's tab shows ^ but the site shows <
      (buffer-string))))

;; pre- and post-processing
(defun my-blog-pages-preprocessor (project-plist)
  (message "In the pages preprocessor."))

(defun my-blog-pages-postprocessor (project-plist)
  (message "In the pages postprocessor."))

(defun my-blog-articles-preprocessor (project-plist)
  (message "In the articles preprocessor."))

(defun my-blog-articles-postprocessor (project-plist)
  "Massage the sitemap file and move it up one directory.
For this to work, we have already fixed the creation of the relative
link in the sitemap-publish function."
  (let* ((sitemap-fn (concat (file-name-sans-extension
                              (plist-get project-plist :sitemap-filename))
                             ".html"))
         (sitemap-olddir (plist-get project-plist :publishing-directory))
         (sitemap-newdir (expand-file-name
                          (concat (file-name-as-directory sitemap-olddir) "..")))
         (sitemap-oldfile (expand-file-name sitemap-fn sitemap-olddir))
         (sitemap-newfile (expand-file-name
                           (concat (file-name-as-directory sitemap-newdir) sitemap-fn))))
    (with-temp-buffer
      (goto-char (point-min))
      (insert-file-contents sitemap-oldfile)
      ;; massage the sitemap if wanted
      ;; delete the old file and write the correct one
      (delete-file sitemap-oldfile)
      (write-file sitemap-newfile))))

(defun my-blog-articles-add-subheader (plist filename pub-dir)
  "Called after the publishing function, this adds a subheader to each blog post."
  (let* ((outfile (file-name-concat pub-dir (concat (file-name-base filename) ".html")))
         (date (format-time-string (car org-time-stamp-custom-formats)
                                   (org-publish-find-date filename plist)))
         (author (org-publish-find-property filename 'author plist)) ; unused
         (re (regexp-quote "<h1 class=\"title\">")))
    ;; open the outfile and splice publishing date into the generated HTML
    (with-temp-buffer
      (insert-file-contents outfile)
      (when (re-search-forward re nil t)
        (end-of-line)
        (insert (format "\n<div class=\"subheader\"><p><i>Published: %s</i></p></div>" date)))
      (write-file outfile))))

(defun my-blog-minify-css (project-plist)
  "Minify most of the CSS using CSSTidy."
  (let* ((csstidy (concat my-website-bin-dir "csstidy"))
         (csstidy-args " --template=highest --silent=true")
         (css-dir (expand-file-name (plist-get project-plist :publishing-directory)))
         (css-files (directory-files css-dir t "^.*\\.css$")))
    ;; CSSTidy does not work with the fonts file
    (dolist (file css-files)
      (unless (string-match-p (regexp-quote "fonts.css") file)
        (with-temp-buffer
          (insert (shell-command-to-string (concat csstidy " " file csstidy-args)))
          (write-file file))))))

;; emacs black magic. This code uses the bib2html binary to generate a list of
;; publications from my bibtex file. On the publications page, the output
;; appears as a table.
(defun generate-bib-html (relfile)
  (let ((bib2html-binary (concat my-website-bin-dir "bibtex2html"))
        (infile (concat my-website-bib-dir relfile))
        (tempfile (make-temp-file "emacs-bib2html")))
    ;; run bib2html with the correct flags
    (call-process bib2html-binary nil nil nil
                  "-noheader" "-nofooter" "-nodoc" "-s" "ieeetr" "-d" "-r"
                  "-nobiblinks" "-nolinks" "-unicode" "-o" tempfile infile)
    ;; massage the output
    (with-temp-buffer
      (insert-file-contents (concat tempfile ".html"))
      ;; make the table left-aligned
      (goto-char (point-min))
      (replace-regexp "<table>" "<table style=\"margin: 0 0 0 0; max-width:100%\">")
      ;; highlight my name (FIXME might be better ways, but for now this works.)
      (goto-char (point-min))
      (replace-regexp "D. Ogbe" "<b>D. Ogbe</b>")
      (buffer-substring (point-min) (point-max)))))

;; finally, pull the project together in the `org-publish-project-alist'
(setq org-publish-project-alist
      `(("blog"
         :components ("blog-articles" "blog-pages" "blog-rss" "blog-css" "blog-images" "blog-dl"))
        ("blog-articles"
         :base-directory ,my-website-blog-dir
         :base-extension "org"
         :publishing-directory ,(concat my-website-out-dir "blog")
         :publishing-function (org-html-publish-to-html my-blog-articles-add-subheader)
         :preparation-function my-blog-articles-preprocessor
         :completion-function my-blog-articles-postprocessor
         :htmlized-source t ;; this enables htmlize, which means that I can use css for code!
         ;; n.b., these actually don't do anything because org mode
         ;; puts the information in the header, but I am overwriting
         ;; the header. leaving here anyway.
         :with-author t
         :with-creator nil
         :with-date t
         :with-timestamps nil
         :headline-level 4
         :section-numbers nil
         :with-toc nil
         :with-drawers t
         :with-sub-superscript nil ;; important!!
         ;; the following removes extra headers from HTML output -- important!
         :html-link-home "/"
         :html-head nil ;; cleans up anything that would have been in there.
         :html-head-extra ,(my-blog-extra-head t)
         :html-head-include-default-style nil
         :html-head-include-scripts nil
         :html-format-drawer-function my-blog-org-export-format-drawer
         :html-home/up-format ""
         :html-mathjax-options ,my-blog-local-mathjax
         :html-mathjax-template ,(concat my-blog-extra-mathjax-config
                                         "<script type=\"text/javascript\" src=\"%PATH\"></script>")
         :html-footnotes-section "<div id='footnotes'><!--%s-->%s</div>"
         :html-link-up ""
         :html-link-home ""
         :html-preamble my-blog-header
         :html-postamble ,my-blog-footer
         ;; sitemap - list of blog articles
         :auto-sitemap t
         :sitemap-filename "blog.org"
         :sitemap-title "Blog"
         ;; custom sitemap generator function
         :sitemap-function my-blog-sitemap
         ;; the default generator; the first entry wins, so this
         ;; duplicate key is disabled
         ;; :sitemap-function org-publish-sitemap-default
         :sitemap-sort-files anti-chronologically
         :sitemap-date-format "Published: %a %b %d %Y")
        ("blog-pages"
         :base-directory ,my-website-pages-dir
         :base-extension "org"
         :publishing-directory ,my-website-out-dir
         :publishing-function org-html-publish-to-html
         :preparation-function my-blog-pages-preprocessor
         :completion-function my-blog-pages-postprocessor
         :htmlized-source t
         :with-author t
         :with-creator nil
         :with-date t
         :with-title nil
         :with-timestamps nil
         :headline-level 4
         :section-numbers nil
         :with-toc nil
         :with-drawers t
         :with-sub-superscript nil ;; important!!
         ;; the following removes extra headers from HTML output -- important!
         :html-link-home "/"
         :html-head nil ;; cleans up anything that would have been in there.
         :html-head-extra ,(my-blog-extra-head nil)
         :html-head-include-default-style nil
         :html-head-include-scripts nil
         :html-format-drawer-function my-blog-org-export-format-drawer
         :html-home/up-format ""
         :html-mathjax-options ,my-blog-local-mathjax
         :html-mathjax-template ,(concat my-blog-extra-mathjax-config
                                         "<script type=\"text/javascript\" src=\"%PATH\"></script>")
         :html-footnotes-section "<div id='footnotes'><!--%s-->%s</div>"
         :html-link-up ""
         :html-link-home ""
         :html-preamble my-blog-header
         :html-postamble ,my-blog-footer)
        ("blog-rss"
         :base-directory ,my-website-blog-dir
         :base-extension "org"
         :publishing-directory ,my-website-out-dir
         :publishing-function org-rss-publish-to-rss
         :with-timestamps nil
         :html-link-home "https://ogbe.net/"
         :html-link-use-abs-url t
         :title "Dennis Ogbe"
         :rss-image-url "https://ogbe.net/img/feed-icon-28x28.png"
         :section-numbers nil
         :exclude ".*"
         :include ("blog.org")
         :table-of-contents nil)
        ("blog-css"
         :base-directory ,my-website-css-dir
         :base-extension ".*"
         :publishing-directory ,(concat my-website-out-dir "res")
         :publishing-function org-publish-attachment
         :completion-function my-blog-minify-css
         :recursive t)
        ("blog-images"
         :base-directory ,my-website-img-dir
         :base-extension ".*"
         :publishing-directory ,(concat my-website-out-dir "img")
         :publishing-function org-publish-attachment
         :recursive t)
        ("blog-dl"
         :base-directory ,my-website-dl-dir
         :base-extension ".*"
         :publishing-directory ,(concat my-website-out-dir "dl")
         :publishing-function org-publish-attachment
         :recursive t)))
Update: I changed a few minor things, including that I now finally add the publish date into the actual published blog post. Also, thanks to a hint by G.M., who provided a fix for my old structure-template definition, I now have an updated structure template to generate the header for a blog post (see here for an explanation):
(require 'org-tempo)
(tempo-define-template "blog-header" ; just some name for the template
  '("#+title: ?" n
    "#+AUTHOR: Dennis Ogbe" n
    "#+EMAIL: [email protected]" n
    "#+DATE:" n
    "#+STARTUP: showall" n
    "#+STARTUP: inlineimages" n
    "#+BEGIN_PREVIEW" n
    p n
    "#+END_PREVIEW")
  "<b"
  "Insert blog header" ; documentation
  'org-tempo-tags)