adamsgaard.dk

my academic webpage
git clone git://src.adamsgaard.dk/adamsgaard.dk

commit 91c6cf410237ba846ea1aa079ee2a4ebe14a38f0
parent 8261f49d9cdb15d18adcfa231afea6d2e861440d
Author: Anders Damsgaard <anders@adamsgaard.dk>
Date:   Fri, 15 Nov 2019 15:13:46 +0100

First completed draft of scholarref blog entry

Diffstat:
A pages/002-scholarref.cfg  |   8 ++++++++
A pages/002-scholarref.html | 223 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 231 insertions(+), 0 deletions(-)

diff --git a/pages/002-scholarref.cfg b/pages/002-scholarref.cfg
@@ -0,0 +1,8 @@
+filename=scholarref.html
+title=The scholarref tools: never deal with journal webpages again
+description=Fetch publications and bibliography references from the command line
+id=scholarref
+tags=scholarref, references, academia, bibtex, latex
+created=2019-11-15
+updated=2019-11-15
+#index=0
diff --git a/pages/002-scholarref.html b/pages/002-scholarref.html
@@ -0,0 +1,223 @@
+<h2>Rationale</h2>
+<p>During the writing phase of an academic paper, common
+tasks include downloading PDFs of publications and getting
+their references into your bibliography. <a
+href="https://en.wikipedia.org/wiki/Digital_object_identifier">DOIs</a>
+are central to modern reference management, and many journals require that
+all cited references are listed with their DOI in the bibliography. I am
+not a fan of navigating the bloated and distracting webpages of academic
+journals and publication aggregators. Their pages are often slow and
+hard to navigate. Often, clicking the "Download PDF" link redirects the
+viewer into an unusable in-browser PDF viewer instead of providing the
+PDF directly. Journal webpages provide links to export a citation for the
+relevant publication, but these are inconsistent in style and content.</p>
+
+<p>For these reasons, I constructed a set of shell tools
+that allow me to perform most of the tasks without having to
+open a browser. As the post title indicates, the goal of the <a
+href="https://src.adamsgaard.dk/scholarref">scholarref</a> tools
+is to provide as much of the functionality a person might need during
+writing as possible from a set of command-line utilities. The tools are under
+<a href="https://src.adamsgaard.dk/scholarref/log.html">continuous
+development</a>, and at present I avoid roughly 90% of visits to journal
+webpages.</p>
+
+<p>The <strong>scholarref</strong> design goals are the following:</p>
+<ul>
+<li>POSIX shell scripts with minimal external dependencies:
+Ensures maximum flexibility and portability.</li>
+<li>Aim for simplicity:
+Fewer lines of code make the programs easier to understand, maintain,
+and debug.</li>
+<li>Each tool should do one thing, and do it well:
+Let the users piece the components together to fit their workflow.</li>
+<li>Return references in BibTeX format</li>
+</ul>
+
+<p><strong>DISCLAIMER:</strong> The functionality provided by these
+programs depends on communication with third party webpages, which may
+or may not be permitted by law and the terms of service upheld by the
+third parties. What is demonstrated here is an example only. Use of
+the tools is entirely your own responsibility.</p>
+
+
+<h2>Installation</h2>
+
+<pre><code>$ git clone git://src.adamsgaard.dk/scholarref
+$ cd scholarref
+# make install</code></pre>
+
+<p>The <strong>make install</strong> command may require superuser
+privileges to install the tools to <strong>/usr/local</strong>. Prefix
+with <strong>doas</strong> or <strong>sudo</strong>, whichever is
+appropriate for the target system.</p>
+
+<h2>The scholarref toolset</h2>
+
+<p>The tools adhere to certain standards, namely POSIX compatibility,
+which ensures that the code is portable. The tools use standard UNIX
+concepts of streams, which makes them modular and compatible with other
+text-manipulating tools. Furthermore, I strive to keep the code as simple
+and minimal as possible. Fewer lines of code mean fewer bugs.</p>
+
+<p>All programs accept input as command-line arguments or from standard
+input (stdin). The programs come with several options, and
+you are encouraged to explore the program help text (option:
+<strong>-h</strong>). The <strong>-t</strong> option may be of
+particular interest, since it tunnels all communication through <a
+href="https://torproject.org">Tor</a> via <strong>torsocks</strong>,
+if available on the system.</p>
+
+<h3>getdoi</h3>
+<p>This tool accepts names of PDF files or arbitrary search queries.
+If a PDF file name is supplied, <strong>getdoi</strong> scans the PDF
+text in order to find the first occurring DOI entry, which typically is
+the DOI of the publication itself. If an arbitrary query is supplied,
+the <a href="http://api.crossref.org">CrossRef API</a> is used to find the
+closest matching publication and its DOI. You can supply author names,
+parts of the title, ORCID, journal name, etc. Examples:</p>
+
+<pre><code>$ getdoi damsgaard2018.pdf
+10.1029/2018ms001299
+$ getdoi 'damsgaard sergienko adcroft journal advances modeling earth systems'
+10.1029/2018ms001299
+</code></pre>
+
+<h3>getref</h3>
+<p>The <strong>getref</strong> tool fetches the BibTeX citation for a
+given DOI from <a href="https://doi.org">doi.org</a>. By default,
+the journal names and author first names are abbreviated, which is
+what most journals want. I have taken most abbreviations from the <a
+href="https://www.library.caltech.edu/journal-title-abbreviations">Caltech
+Library list of Journal Title Abbreviations</a>. My journal-title
+abbreviation ruleset is incomplete, and is expanded on a per-need
+basis. The abbreviation functionality can be disabled; see <strong>getref
+-h</strong> for details.</p>
+
+<pre><code>$ getref 10.1029/2018ms001299
+@article{Damsgaard2018,
+ doi = {10.1029/2018ms001299},
+ year = 2018,
+ publisher = {American Geophysical Union ({AGU})},
+ volume = {10},
+ number = {9},
+ pages = {2228--2244},
+ author = {A. Damsgaard and A. Adcroft and O. Sergienko},
+ title = {Application of Discrete Element Methods to Approximate Sea Ice Dynamics},
+ journal = {J. Adv. Mod. Earth Sys.}
+}
+$ getref -j 10.1029/2018ms001299 # do not abbreviate journal title
+@article{Damsgaard2018,
+ doi = {10.1029/2018ms001299},
+ year = 2018,
+ publisher = {American Geophysical Union ({AGU})},
+ volume = {10},
+ number = {9},
+ pages = {2228--2244},
+ author = {A. Damsgaard and A. Adcroft and O. Sergienko},
+ title = {Application of Discrete Element Methods to Approximate Sea Ice Dynamics},
+ journal = {Journal of Advances in Modeling Earth Systems}
+}
+</code></pre>
+
+<h3>shdl</h3>
+<p>This tool takes a DOI as input and attempts to
+download the corresponding publication as a PDF through <a
+href="https://sci-hub.tw">sci-hub</a>. Unfortunately, the sci-hub web
+interface often puts up captchas to restrict automated downloads. If that's
+the case, <strong>shdl</strong> opens the Tor Browser (if installed)
+or the system web browser in order to manually complete the
+download. Output PDF files are saved in the present working directory.</p>
+
+
+<h2>Usage examples</h2>
+
+<p>If you want a BibTeX reference from a search query, simply use UNIX
+pipes to send the <strong>getdoi</strong> output as input for
+<strong>getref</strong>:</p>
+
+<pre><code>$ getdoi 'damsgaard egholm ice flow dynamics' | getref
+@article{Damsgaard2016,
+ doi = {10.1002/2016gl071579},
+ year = 2016,
+ publisher = {American Geophysical Union ({AGU})},
+ volume = {43},
+ number = {23},
+ pages = {12,165--12,173},
+ author = {A. Damsgaard and D. L. Egholm and L. H. Beem and S. Tulaczyk and N. K. Larsen and J. A. Piotrowski and M. R. Siegfried},
+ title = {Ice flow dynamics forced by water pressure variations in subglacial granular beds},
+ journal = {Geophys. Res. Lett.}
+}
+</code></pre>
+
+<p>The <strong>scholarref</strong> program itself is an aggregation of
+the <strong>getdoi</strong> and <strong>getref</strong> commands. If
+called with the <strong>-a</strong> option, the reference
+is directly inserted into the system bibliography. The full
+path to the bibliography file (.bib) is assumed to be set in the
+<strong>$BIB</strong> environment variable, for instance defined in the
+user's <strong>~/.profile</strong>.</p>
+
+<pre><code>$ echo $BIB
+/home/ad/articles/own/BIBnew.bib
+$ scholarref -a 'damsgaard egholm ice flow dynamics'
+Citation Damsgaard2016 added to /home/ad/articles/own/BIBnew.bib
+</code></pre>
+
+
+<h2>Integrating into your favorite $EDITOR</h2>
+<p>The <strong>scholarref</strong> tool is particularly useful if called
+from within a text editor.</p>
+
+<h3>vi</h3>
+<p>My editor of choice is the plain, old, and simple <a
+href="https://man.openbsd.org/vi">vi(1)</a>. I have the following binding
+in my <strong>~/.exrc</strong>, including a trailing space:</p>
+
+<pre><code>map qr :r !scholarref </code></pre>
+<p>The rest of my editor configuration can be found under my <a
+href="https://src.adamsgaard.dk/dotfiles/file/.exrc.html">dotfiles source
+code repository</a>.</p>
+
+<h3>vim</h3>
+<p>You can add the following bindings to <strong>~/.vimrc</strong>
+or <strong>~/.vim/vimrc</strong> in order to get scholarref functionality
+within <a href="https://www.vim.org/">vim(1)</a>:</p>
+
+<pre><code>nnoremap &lt;leader&gt;r :r !scholarref&lt;space&gt; " insert reference into current buffer
+nnoremap &lt;leader&gt;R :r !scholarref --add&lt;space&gt; " append reference into $BIB file
+</code></pre>
+
+<h3>vis</h3>
+<p>The <a href="https://github.com/martanne/vis">vis(1)</a>
+editor is an interesting combination of modal editing
+and structural regular expressions from the Plan 9 editor <a
+href="https://sam.cat-v.org/">sam(1)</a>. After using it exclusively for
+three months, I concluded that it is too immature for general use. I have
+the following binding in my <strong>~/.config/visrc.lua</strong>:</p>
+
+<pre><code>vis:map(vis.modes.NORMAL, leader..'r', ':&lt; scholarref ')</code></pre>
+
+<h3>emacs</h3>
+<p>Don't know, figure it out yourself.</p>
+
+<h2>Integrating into your PDF viewer</h2>
+<p>My PDF viewer of choice is <a
+href="https://pwmt.org/projects/zathura">zathura(1)</a>, which has a
+minimal graphical user interface and is keyboard-centric. The following
+configuration calls <strong>getdoi</strong> on the currently open file
+if I press <strong>Ctrl-i</strong>. The resultant DOI is copied to the
+clipboard. Similarly, <strong>Ctrl-s</strong> tries to extract the DOI
+in the same manner, but fetches the accompanying reference and adds it
+directly to the bibliography.</p>
+
+<pre><code>map &lt;C-i&gt; feedkeys ":exec getdoi --notify --clip '$FILE'&lt;Return&gt;"
+map &lt;C-s&gt; feedkeys ":exec scholarref --add '$FILE'&lt;Return&gt;"
+</code></pre>
+
+<p>My full zathura configuration file is available <a
+href="https://src.adamsgaard.dk/dotfiles/file/.config/zathura/zathurarc.html">here</a>.</p>
+
+<h2>Questions/bugs/feedback/improvements</h2>
+<p>Please <a href="contact.html">get in touch</a> if you encounter
+any. Improvement suggestions are best sent as patches by e-mail.</p>
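
Note on the getdoi description in the diff above: the post says the tool scans PDF text for the first DOI it finds. The snippet below is only a rough sketch of that one step and is not taken from the scholarref source. It assumes the PDF text has already been extracted to standard input by some other program, the function name first_doi is hypothetical, and the pattern is a simplified guess at what a DOI looks like. It also relies on grep -o, which is a common GNU/BSD extension rather than strict POSIX.

```shell
#!/bin/sh
# Hypothetical helper (not part of scholarref): print the first DOI-like
# string found in the text arriving on standard input.
first_doi() {
	# Simplified pattern: "10.", a 4-9 digit registrant code, a slash,
	# then a suffix running up to whitespace, a quote, or an angle bracket.
	# Trailing punctuation directly after a DOI is NOT stripped here.
	grep -oE '10\.[0-9]{4,9}/[^[:space:]"<>]+' | head -n 1
}

# Example input with two DOIs; only the first one is printed.
printf '%s\n' \
	'See https://doi.org/10.1029/2018ms001299 for details.' \
	'Related work: doi:10.1002/2016gl071579' | first_doi
```

Because the helper reads from stdin and writes one line to stdout, it would slot into a pipe in the same way the post describes for getdoi and getref.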