commit 91c6cf410237ba846ea1aa079ee2a4ebe14a38f0
parent 8261f49d9cdb15d18adcfa231afea6d2e861440d
Author: Anders Damsgaard <anders@adamsgaard.dk>
Date: Fri, 15 Nov 2019 15:13:46 +0100
First completed draft of scholarref blog entry
Diffstat:
2 files changed, 231 insertions(+), 0 deletions(-)
diff --git a/pages/002-scholarref.cfg b/pages/002-scholarref.cfg
@@ -0,0 +1,8 @@
+filename=scholarref.html
+title=The scholarref tools: never deal with journal webpages again
+description=Fetch publications and bibliography references from the command line
+id=scholarref
+tags=scholarref, references, academia, bibtex, latex
+created=2019-11-15
+updated=2019-11-15
+#index=0
diff --git a/pages/002-scholarref.html b/pages/002-scholarref.html
@@ -0,0 +1,223 @@
+<h2>Rationale</h2>
+<p>During the writing phase of an academic paper, common
+tasks include downloading PDFs of publications and getting
+their references into your bibliography. <a
+href="https://en.wikipedia.org/wiki/Digital_object_identifier">DOIs</a>
+are central to modern reference management, and many journals require that
+all cited references are listed with their DOI in the bibliography. I am
+not a fan of navigating the bloated and distracting webpages of academic
+journals and publication aggregators. Their pages are often slow and
+hard to navigate. Often, clicking the "Download PDF" link redirects the
+viewer into an unusable in-browser PDF viewer instead of providing the
+PDF directly. Journal webpages provide links to export a citation for the
+relevant publication, but these are inconsistent in style and content.</p>
+
+<p>For these reasons, I constructed a set of shell tools
+that allow me to perform most of the tasks without having to
+open a browser. As the post title indicates, the goal of the <a
+href="https://src.adamsgaard.dk/scholarref">scholarref</a> tools
+is to provide as much as possible of the functionality a person might
+need during the writing phase from a set of command-line utilities. The tools are under
+<a href="https://src.adamsgaard.dk/scholarref/log.html">continuous
+development</a>, and at present I avoid roughly 90% of visits to journal
+webpages.</p>
+
+<p>The <strong>scholarref</strong> design goals are the following:</p>
+<ul>
+<li>POSIX shell scripts with minimal external dependencies:
+Ensures maximum flexibility and portability.</li>
+<li>Aim for simplicity:
+Fewer lines of code make the programs easier to understand, maintain,
+and debug.</li>
+<li>Each tool should do one thing, and do it well:
+Let the users piece the components together to fit their workflow.</li>
+<li>Return references in BibTeX format</li>
+</ul>
+
+<p><strong>DISCLAIMER:</strong> The functionality provided by these
+programs depends on communication with third party webpages, which may
+or may not be permitted by law and the terms of service upheld by the
+third parties. What is demonstrated here is an example only. Use of
+the tools is entirely your own responsibility.</p>
+
+
+<h2>Installation</h2>
+
+<pre><code>$ git clone git://src.adamsgaard.dk/scholarref
+$ cd scholarref
+# make install</code></pre>
+
+<p>The <strong>make install</strong> command may require superuser
+privileges to install the tools to <strong>/usr/local</strong>. Prefix
+with <strong>doas</strong> or <strong>sudo</strong>, whatever is
+appropriate for the target system.</p>
+
+<h2>The scholarref toolset</h2>
+
+<p>The tools adhere to certain standards, namely POSIX compatibility
+which ensures that the code is portable. The tools use standard UNIX
+concepts of streams, which makes them modular and compatible with other
+text-manipulating tools. Furthermore, I strive to keep the code as simple
+and minimal as possible. Fewer lines of code mean fewer bugs.</p>
+
+<p>All programs accept input as command-line arguments or from standard
+input (stdin). The programs come with several options, and
+I encourage you to explore the program help text (option:
+<strong>-h</strong>). The <strong>-t</strong> option may be of
+particular interest, since it tunnels all communication through <a
+href="https://torproject.org">Tor</a> via <strong>torsocks</strong>,
+if available on the system.</p>
+
+<h3>getdoi</h3>
+<p>This tool accepts names of PDF files or arbitrary search queries.
+If a PDF file name is supplied, <strong>getdoi</strong> scans the PDF
+text in order to find the first occurring DOI entry, which typically is
+the DOI of the publication itself. If an arbitrary query is supplied,
+the tool queries the <a href="http://api.crossref.org">CrossRef API</a> to find the
+closest matching publication and its DOI. You can supply author names,
+parts of the title, ORCID, journal name, etc. Examples:</p>
+
+<pre><code>$ getdoi damsgaard2018.pdf
+10.1029/2018ms001299
+$ getdoi 'damsgaard sergienko adcroft journal advances modeling earth systems'
+10.1029/2018ms001299
+</code></pre>
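+<p>As a rough illustration of the PDF case, the sketch below prints the
+first DOI-like string found in plain text on standard input. It is not
+the <strong>getdoi</strong> source: the <strong>first_doi</strong>
+function name and the matching pattern are assumptions made for this
+example, and the text is assumed to have been extracted from the PDF
+already.</p>

```shell
#!/bin/sh
# Hypothetical helper, not part of scholarref: print the first DOI-like
# string ("10.", four or more digits, "/", then non-blank characters)
# found in plain text read from stdin, and stop at the first match.
# Limitation: punctuation directly after the DOI is included in the match.
first_doi() {
	awk 'match($0, /10\.[0-9][0-9][0-9][0-9]+\/[^ \t]+/) {
		print substr($0, RSTART, RLENGTH)
		exit
	}'
}

# prints 10.1029/2018ms001299
printf 'Cite as:\ndoi: 10.1029/2018ms001299\nsee also 10.1002/2016gl071579\n' | first_doi
```

+<p>Only awk(1) is used here, so the sketch stays within POSIX; a real
+implementation would also need a step that converts the PDF to text.</p>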
+
+<h3>getref</h3>
+<p>The <strong>getref</strong> tool fetches the BibTeX citation for a
+given DOI from <a href="https://doi.org">doi.org</a>. By default,
+the journal names and author first names are abbreviated, which is
+what most journals want. I have taken most abbreviations from the <a
+href="https://www.library.caltech.edu/journal-title-abbreviations">Caltech
+Library list of Journal Title Abbreviations</a>. My journal-title
+abbreviation ruleset is incomplete, and is expanded on a per-need
+basis. The abbreviation functionality can be disabled; see <strong>getref
+-h</strong> for details.</p>
+
+<pre><code>$ getref 10.1029/2018ms001299
+@article{Damsgaard2018,
+ doi = {10.1029/2018ms001299},
+ year = 2018,
+ publisher = {American Geophysical Union ({AGU})},
+ volume = {10},
+ number = {9},
+ pages = {2228--2244},
+ author = {A. Damsgaard and A. Adcroft and O. Sergienko},
+ title = {Application of Discrete Element Methods to Approximate Sea Ice Dynamics},
+ journal = {J. Adv. Mod. Earth Sys.}
+}
+$ getref -j 10.1029/2018ms001299 # do not abbreviate journal title
+@article{Damsgaard2018,
+ doi = {10.1029/2018ms001299},
+ year = 2018,
+ publisher = {American Geophysical Union ({AGU})},
+ volume = {10},
+ number = {9},
+ pages = {2228--2244},
+ author = {A. Damsgaard and A. Adcroft and O. Sergienko},
+ title = {Application of Discrete Element Methods to Approximate Sea Ice Dynamics},
+ journal = {Journal of Advances in Modeling Earth Systems}
+}
+</code></pre>
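+<p>For readers curious about the underlying request, doi.org can return
+BibTeX through HTTP content negotiation. The sketch below is not the
+<strong>getref</strong> source: the <strong>fetch_bibtex</strong>
+function name and the use of curl(1) are assumptions, and the raw
+response is not abbreviated in the way <strong>getref</strong> output
+is.</p>

```shell
#!/bin/sh
# Hypothetical sketch, not the getref implementation: ask doi.org for a
# BibTeX record via the Accept header. Assumes curl(1) is installed.
# Usage: fetch_bibtex 10.1029/2018ms001299
fetch_bibtex() {
	# -s: no progress output, -L: follow the redirect issued by doi.org
	curl -sL -H 'Accept: application/x-bibtex' "https://doi.org/$1"
}
```

+<p>The function only defines the request; nothing is fetched until it is
+called with a DOI as its first argument.</p>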
+
+<h3>shdl</h3>
+<p>This tool takes a DOI as input and attempts to
+download the corresponding publication as a PDF through <a
+href="https://sci-hub.tw">sci-hub</a>. Unfortunately, the sci-hub web
+interface often puts up captchas to restrict automated downloads. If that's
+the case, <strong>shdl</strong> opens the tor browser (if installed)
+or the system web browser in order to manually complete the
+download. Output PDF files are saved in the present working directory.</p>
+
+
+<h2>Usage examples</h2>
+
+<p>If you want a BibTeX reference for a search query, simply use UNIX
+pipes to send the <strong>getdoi</strong> output as input for
+<strong>getref</strong>:</p>
+
+<pre><code>$ getdoi 'damsgaard egholm ice flow dynamics' | getref
+@article{Damsgaard2016,
+ doi = {10.1002/2016gl071579},
+ year = 2016,
+ publisher = {American Geophysical Union ({AGU})},
+ volume = {43},
+ number = {23},
+ pages = {12,165--12,173},
+ author = {A. Damsgaard and D. L. Egholm and L. H. Beem and S. Tulaczyk and N. K. Larsen and J. A. Piotrowski and M. R. Siegfried},
+ title = {Ice flow dynamics forced by water pressure variations in subglacial granular beds},
+ journal = {Geophys. Res. Lett.}
+}
+</code></pre>
+
+<p>The <strong>scholarref</strong> program itself is an aggregation of
+the <strong>getdoi</strong> and <strong>getref</strong> commands. If
+called with the <strong>-a</strong> option, the reference
+is directly inserted into the system bibliography. The full
+path to the bibliography file (.bib) is assumed to be set in the
+<strong>$BIB</strong> environment variable, for instance defined in the
+user <strong>~/.profile</strong>.</p>
+
+<pre><code>$ echo $BIB
+/home/ad/articles/own/BIBnew.bib
+$ scholarref -a 'damsgaard egholm ice flow dynamics'
+Citation Damsgaard2016 added to /home/ad/articles/own/BIBnew.bib
+</code></pre>
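+<p>A minimal sketch of the corresponding <strong>~/.profile</strong>
+entry, assuming an sh-compatible login shell and reusing the path from
+the example; adjust it to wherever your own .bib file lives:</p>

```shell
# ~/.profile (sketch): point scholarref at the bibliography file.
# The path mirrors the example and is only a placeholder.
export BIB="$HOME/articles/own/BIBnew.bib"
```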
+
+
+<h2>Integrating into your favorite $EDITOR</h2>
+<p>The <strong>scholarref</strong> tool is particularly useful if called
+from within a text editor.</p>
+
+<h3>vi</h3>
+<p>My editor of choice is the plain, old, and simple <a
+href="https://man.openbsd.org/vi">vi(1)</a>. I have the following binding
+in my <strong>~/.exrc</strong>, including a trailing space:</p>
+
+<pre><code>map qr :r !scholarref </code></pre>
+<p>The rest of my editor configuration can be found under my <a
+href="https://src.adamsgaard.dk/dotfiles/file/.exrc.html">dotfiles source
+code repository</a>.</p>
+
+<h3>vim</h3>
+<p>You can add the following bindings to <strong>~/.vimrc</strong>
+or <strong>~/.vim/vimrc</strong> in order to get scholarref functionality
+within <a href="https://www.vim.org/">vim(1)</a>:</p>
+
+<pre><code>" insert reference into current buffer
+nnoremap &lt;leader&gt;r :r !scholarref&lt;space&gt;
+" append reference into $BIB file
+nnoremap &lt;leader&gt;R :r !scholarref --add&lt;space&gt;
+</code></pre>
+
+<h3>vis</h3>
+<p>The <a href="https://github.com/martanne/vis">vis(1)</a>
+editor is an interesting combination of modal editing
+and structural regular expressions from the plan9 editor <a
+href="https://sam.cat-v.org/">sam(1)</a>. After using it exclusively for
+three months, I concluded that it is too immature for general use. I have
+the following binding in my <strong>~/.config/visrc.lua</strong>:</p>
+
+<pre><code>vis:map(vis.modes.NORMAL, leader..'r', ':< scholarref ')</code></pre>
+
+<h3>emacs</h3>
+<p>Don't know, figure it out yourself.</p>
+
+<h2>Integrating into your PDF viewer</h2>
+<p>My PDF viewer of choice is <a
+href="https://pwmt.org/projects/zathura">zathura(1)</a>, which has a
+minimal graphical user interface and is keyboard-centric. The following
+configuration calls <strong>getdoi</strong> on the currently open file
+if I press <strong>Ctrl-i</strong>. The resultant DOI is copied to the
+clipboard. Similarly, <strong>Ctrl-s</strong> tries to extract the DOI
+in the same manner, but fetches the accompanying reference and adds it
+directly to the bibliography.</p>
+
+<pre><code>map &lt;C-i&gt; feedkeys ":exec getdoi --notify --clip '$FILE'&lt;Return&gt;"
+map &lt;C-s&gt; feedkeys ":exec scholarref --add '$FILE'&lt;Return&gt;"
+</code></pre>
+
+<p>My full zathura configuration file is available <a
+href="https://src.adamsgaard.dk/dotfiles/file/.config/zathura/zathurarc.html">here</a>.</p>
+
+<h2>Questions/bugs/feedback/improvements</h2>
+<p>Please <a href="contact.html">get in touch</a> if you have questions,
+bug reports, or feedback. Improvement suggestions are best sent as patches by e-mail.</p>