Does this feel familiar?

“The paper builds very much on [??], the extension […] is conceptually a straightforward extension which has been done before several times [..]”.

I have stopped counting the number of times I’ve received reviews claiming that our work has been done many times before, without a single reference to back the claim up. That’s not to say the reviewers don’t have a point: I’m sure we overlooked some related work that we should have cited. However, such comments without references are not only a source of frustration but also, in my opinion, a bit on the dishonest side of academic behavior.

That being said, it would be too easy to blame this on reviewer laziness; I think it is just a symptom of a bigger problem. Although we spend countless hours writing papers and citing others, through either BibTeX or EndNote, we are severely lacking simple tools to quickly cite works in plain text when we fill in online forms or just write emails.

Alfred and Python to the rescue

I have been a big fan of Alfred for a while now; it is a launch bar that puts Spotlight on steroids. There are alternatives for other operating systems, e.g. Ubuntu, although I haven’t tested them. I have modified a workflow found here to quickly insert a reference to a technical paper from my BibTeX file as plain text. The workflow uses regular expressions to parse a BibTeX file, lets you quickly find a reference, and pastes it into any document. My only modifications to the original workflow have been to add a few regular expressions and to change the formatting of the output to suit my needs a bit better.
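To make the idea concrete, here is a minimal sketch of the line-by-line regular-expression approach, separate from the workflow itself. The sample entry, the `FIELD` pattern and the helper names `parse_fields` and `plain_text` are assumptions made up for illustration; the pattern only handles brace-delimited fields that fit on one line.

```python
import re

# Hypothetical BibTeX entry, used only for illustration.
SAMPLE = """@article{doe2018example,
  author = {Doe, Jane and Roe, Richard and Poe, Edgar},
  title = {An Example Title},
  journal = {Journal of Examples},
  year = {2018},
}"""

# match() anchors at the start of the line, so after optional indentation
# the field name must come first; 'booktitle' is not mistaken for 'title'.
FIELD = re.compile(r'\s*(author|title|year)\s*=\s*\{(.*)\},?')


def parse_fields(entry):
    """Return a dict with the author, title and year fields of one entry."""
    fields = {}
    for line in entry.splitlines():
        m = FIELD.match(line)
        if m:
            fields[m.group(1)] = m.group(2)
    return fields


def plain_text(fields):
    """Format the fields as 'Authors, Title, Year' for pasting as plain text."""
    authors = fields['author'].split(' and ')
    # Three or more authors are shortened to 'First Author et al.'
    who = authors[0] + ' et al.' if len(authors) > 2 else fields['author']
    return '{}, {}, {}'.format(who, fields['title'], fields['year'])


print(plain_text(parse_fields(SAMPLE)))
# prints: Doe, Jane et al., An Example Title, 2018
```

A proper BibTeX parser would cope with multi-line fields and quoted values; the regular-expression route trades that robustness for speed and zero dependencies, which suits a launcher that runs the script on every keystroke.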

The proof is in the pudding, see below!

My modified code is provided below.

#!/usr/bin/env python2
# encoding: utf-8
#
# Copyright (c) 2018 Dean Jackson <deanishe@deanishe.net>
#
# MIT Licence. See http://opensource.org/licenses/MIT
#
# Created on 2018-06-12
#

"""Extract BibTeX citations and show them in Alfred."""

from __future__ import print_function, absolute_import

from collections import namedtuple
import json
import os
import re
import sys

# .bib file to read is specified in workflow's configuration sheet
BIBFILE = os.path.expanduser(os.getenv('BIBFILE', ''))
ICON_WARNING = ('/System/Library/CoreServices/CoreTypes.bundle/Contents/'
                'Resources/AlertCautionIcon.icns')
    
match_key = re.compile(r'@(\w+){(\S+),').match
# Field patterns are applied with .match(), which anchors at the start of the
# line, so 'booktitle' is never mistaken for 'title'. Any indentation is allowed.
match_title = re.compile(r'\s*title\s*=\s*\{(.*)\},?').match
match_auth = re.compile(r'\s*author\s*=\s*\{(.*)\},?').match
match_year = re.compile(r'\s*year\s*=\s*\{(.*)\},?').match
match_jour = re.compile(r'\s*journal\s*=\s*\{(.*)\},?').match

# Data model. Contains the title of the work, citekey and type.
CiteKey = namedtuple('CiteKey', 'title key type auth year')


def log(s, *args):
    """Write string to STDERR."""
    if args:
        s = s % args

#    if isinstance(s, unicode):
#        s = s.encode('utf-8')

    print(s, file=sys.stderr)


def extract_citekeys(bibtex):
    """Parse citekeys contained in the BibTeX."""
    keys = []
    lines = bibtex.splitlines()

    title = key = typ = auth = year = None

    for line in lines:
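        # A closing brace in the first column ends the current entry.
        if line.rstrip() == '}':
            # An entry needs at least a key, a type and an author to be usable.
            if key and typ and auth:
            #if title and key and typ:
                keys.append(CiteKey(title, key, typ, auth, year))
            else:
                log('Invalid entry: title=%r, key=%r, typ=%r, auth=%r',
                    title, key, typ, auth)
            # Reset every field, including year, so nothing leaks into the next entry.
            title = key = typ = auth = year = None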
            continue

        m = match_key(line)
        if m:
            typ, key = m.groups()
            #log("Key: %r", line)
            continue

        
        m = match_title(line)
        if m:
            #log("Title: %r", line)
            title = m.group(1)

        m = match_auth(line)
        if m:
            auth = m.group(1)

        m = match_year(line)
        if m:
            year = m.group(1)
            
        #m = match_jour(line)
        #if m:
        #    jour = m.group(1)
        
    return keys


def print_items(*items):
    """Write items to STDOUT as JSON."""
    json.dump(dict(items=items), sys.stdout)


def warning_item(title, subtitle='Try a different query'):
    """Alfred item that shows a warning message."""
    return dict(title=title, subtitle=subtitle,
                valid=False, icon={'path': ICON_WARNING})

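def clean_authors(bibtex_author_list):
    """Shorten a BibTeX author list to 'First Author et al.' for 3+ authors."""
    # Split on ' and ' with surrounding spaces so names such as 'Ferdinand' stay intact.
    authors = bibtex_author_list.split(' and ')
    if len(authors) > 2:
        return authors[0].strip() + ' et al.'
    return bibtex_author_list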

def result_item(ck):
    """Create an Alfred item for a CiteKey."""
    #subtitle = u'{ck.auth} {ck.key} ({ck.type})'.format(ck=ck)
    #subtitle = u'{ck.auth} ({ck.type})'.format(ck=ck)
    subtitle = u'{auth} {year} ({type})'.format(
        auth=clean_authors(ck.auth), year=ck.year, type=ck.type)
    return dict(title=ck.title,
                subtitle=subtitle,
                match=u'{}, {}, {}'.format(ck.auth, ck.title, ck.key),
                #arg=u'[@{}]'.format(ck.key),
                #arg=u'\\cite{{{}}}'.format(ck.key),
                arg=u'{}, {}, {}'.format(clean_authors(ck.auth), ck.title, ck.year),
                valid=True)

def my_result_item(ck):
    """Create an Alfred item for a CiteKey."""
    #subtitle = u'{ck[\'ID\']} ({ck[\'ENTRYTYPE\']})'.format(ck=ck)
    log("%s",ck)
    return dict(title=ck["title"],
                subtitle=' ',#subtitle,
                match = '{} {}'.format(ck['title'], ck['ID']),
                arg='[@{}]'.format(ck['ID']),
                valid=True)


def main():
    """Run Script Filter."""
    if not os.path.exists(BIBFILE):
        log('%s does not exist', BIBFILE)
        print_items(warning_item(BIBFILE + ' does not exist',
                                 'Check the BIBFILE setting'))
        return

    with open(BIBFILE) as fp:
        txt = fp.read()#.decode('utf-8')
        
#    with open(BIBFILE) as bibtex_file:
#        bib_database = bibtexparser.bparser.BibTexParser(common_strings=True).parse_file(bibtex_file)

    
    keys = extract_citekeys(txt)
    log('%d citekeys in %s', len(keys), BIBFILE)
    if not keys:
        print_items(warning_item('No citekeys found in file', BIBFILE))
        return

    print_items(*[result_item(ck) for ck in keys])
#    print_items(*[my_result_item(ck) for ck in bib_database.entries])


if __name__ == '__main__':
    main()
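
For orientation, the JSON that `print_items` writes to STDOUT is a single object with a top-level `items` list, one dictionary per result row, which is the shape an Alfred Script Filter reads. The sketch below builds one such row by hand; the helper names `alfred_item` and `script_filter_payload` and the reference values are hypothetical and not part of the workflow.

```python
import json


def alfred_item(title, subtitle, arg):
    """Build one result row; 'arg' is the text Alfred passes on when selected."""
    return {'title': title, 'subtitle': subtitle, 'arg': arg, 'valid': True}


def script_filter_payload(items):
    """Serialise rows in the {"items": [...]} shape a Script Filter reads."""
    return json.dumps({'items': list(items)}, indent=2)


# Hypothetical reference, for illustration only.
item = alfred_item(
    title='An Example Title',
    subtitle='Doe, Jane et al. 2018 (article)',
    arg='Doe, Jane et al., An Example Title, 2018',
)
print(script_filter_payload([item]))
```

In the script above, `arg` carries the plain-text citation, so whatever follows the Script Filter in the workflow (for example a paste or copy-to-clipboard action) receives the formatted reference rather than the citekey.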