search/About

    Last updated: October 16th, 2023

    This page is about search and searching on this website.

    A seedful and seamful search interface

    My goal here is to provide interfaces to my website that explore “search seeds” (Griffin, 2022) and seamfulness (Eslami et al., 2016). I will try to expose points of interaction by talking explicitly about keywords and queries, suggesting search operators and keyboard shortcuts, and noting how search tools shape questions and content. I will try to see even results-of-search (Mulligan & Griffin, 2018) as seeds for future querying. I will consider how to engage with null results and endeavor to support sharing and repairing searches and search. I will situate the searching and search tools.

    How to search on this website?

    See search/Guide.

    How does search work on this website?

    Search on this website is partially an experiment/exercise/exploration in using, filling, and explaining gaps in an outdated JavaScript client-side search library: Lunr.js (GitHub). This search function currently provides mostly ‘site search’: it searches only the pages of this website (except for content that is not indexed, as indicated below). It does not currently search the content of PDFs hosted on this website.

    Added August 31, 2023 11:12 AM (PDT)

    Note: There is some minimal support for external searches; see “External results” (below).

    Added October 16, 2023 09:49 AM (PDT)

    Note: There is also a minimal working example of FlexSearch running: FlexSearch Working Example

    Lunr.js

    • Index last updated: 2023-06-05 18:45:36 UTC

    Lunrish

    See: /changes/2023/12/11/notes/

    noindex

    Pages not included within the index are:

    • /Users/dsg/danielsgriffin/404.html
    • /Users/dsg/danielsgriffin/index.html
    • /Users/dsg/danielsgriffin/search/index.html
    • /Users/dsg/danielsgriffin/tweets/ad-z’njinz’tum.md
    • /Users/dsg/danielsgriffin/tweets/end-of-line-hyphenation.md
    • /Users/dsg/danielsgriffin/tweets/postcolonial-localization-in-search.md
    • /Users/dsg/danielsgriffin/tweets/searching-entangled-and-enfolded.md

    stopwords

    The following words are not indexed by Lunr.js. They do not distinguish results when they appear in a query, unless the query is an exact phrase search (see below).

    'a', 'able', 'about', 'across', 'after', 'all', 'almost', 'also', 'am', 'among', 'an', 'and', 'any', 'are', 'as', 'at', 'be', 'because', 'been', 'but', 'by', 'can', 'cannot', 'could', 'dear', 'did', 'do', 'does', 'either', 'else', 'ever', 'every', 'for', 'from', 'get', 'got', 'had', 'has', 'have', 'he', 'her', 'hers', 'him', 'his', 'how', 'however', 'i', 'if', 'in', 'into', 'is', 'it', 'its', 'just', 'least', 'let', 'like', 'likely', 'may', 'me', 'might', 'most', 'must', 'my', 'neither', 'no', 'nor', 'not', 'of', 'off', 'often', 'on', 'only', 'or', 'other', 'our', 'own', 'rather', 'said', 'say', 'says', 'she', 'should', 'since', 'so', 'some', 'than', 'that', 'the', 'their', 'them', 'then', 'there', 'these', 'they', 'this', 'tis', 'to', 'too', 'twas', 'us', 'wants', 'was', 'we', 'were', 'what', 'when', 'where', 'which', 'while', 'who', 'whom', 'why', 'will', 'with', 'would', 'yet', 'you', 'your'
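
    To illustrate why these words do not distinguish results, here is a minimal sketch in plain JavaScript. It is not Lunr.js's own pipeline code: `STOPWORDS` is an abbreviated subset of the list above, and `dropStopwords` is a hypothetical helper name used only for this example.

```javascript
// Abbreviated subset of the stopword list above (illustration only).
const STOPWORDS = new Set(['a', 'about', 'how', 'is', 'of', 'the', 'this', 'to', 'what']);

// Hypothetical helper: lowercase the query, split on whitespace, drop stopwords.
function dropStopwords(query) {
  return query
    .toLowerCase()
    .split(/\s+/)
    .filter((term) => term.length > 0 && !STOPWORDS.has(term));
}

console.log(dropStopwords('what is the seamfulness of search'));
// → [ 'seamfulness', 'search' ]
```

    Only the remaining terms can match anything, so two queries that differ only in stopwords would be expected to return the same results.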

    Exact phrase searching

    Lunr.js does not directly support exact phrase searching where spaces appear between search terms. Such searching is minimally made possible in an additional function (exactSearch) in js/search/results.js that conducts a simple includes check across the documents in item_index.json.
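
    The idea of an includes check can be sketched as follows. This is not the exactSearch function from js/search/results.js: `exactPhraseSketch`, the sample documents, and the `url` and `content` field names are all assumptions made for illustration, as is the choice to ignore case.

```javascript
// Hypothetical stand-in for a few entries of item_index.json;
// the `url` and `content` field names are assumptions.
const sampleIndex = [
  { url: '/search/guide/', content: 'Use quotes for an exact phrase search.' },
  { url: '/search/about/', content: 'A seedful and seamful search interface.' },
];

// Return the documents whose content contains the phrase verbatim.
// Lowercasing both sides (case-insensitivity) is an assumption of this sketch.
function exactPhraseSketch(phrase, docs) {
  const needle = phrase.toLowerCase();
  return docs.filter((doc) => doc.content.toLowerCase().includes(needle));
}

console.log(exactPhraseSketch('seamful search', sampleIndex).map((d) => d.url));
// → [ '/search/about/' ]
```

    Because the check is a plain substring match, word order and spacing matter: 'search seamful' matches nothing in the sample above.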

    Sorting

    Lunr.js does not directly support sorting. Sorting is added with simple bespoke code in js/search/results.js and js/search/serp.js. See more at sort: in search/Guide/.
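
    As a rough picture of what sorting after the search can look like, here is a minimal sketch. It is not the code in js/search/results.js or js/search/serp.js: the `score`, `date`, and `title` fields and the `sortResultsSketch` name are assumptions, and the sort options actually offered are described in search/Guide/.

```javascript
// Hypothetical result objects; `score`, `date`, and `title` are assumed fields.
const sampleResults = [
  { title: 'B', score: 1.2, date: '2023-08-31' },
  { title: 'A', score: 2.5, date: '2022-12-01' },
  { title: 'C', score: 0.7, date: '2023-10-16' },
];

// Sort a copy of the results by one field, descending by default.
function sortResultsSketch(results, field, descending = true) {
  const sorted = [...results].sort((x, y) => {
    if (x[field] < y[field]) return -1;
    if (x[field] > y[field]) return 1;
    return 0;
  });
  return descending ? sorted.reverse() : sorted;
}

console.log(sortResultsSketch(sampleResults, 'date').map((r) => r.title));
// → [ 'C', 'B', 'A' ]
```

    ISO-formatted date strings compare correctly as plain strings, which is why no date parsing is needed in this sketch.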

    Autocomplete

    This is written in vanilla JavaScript in several scripts, including autocomplete.js and suggestions.js. These scripts manage the autocomplete suggestions list.

    hand-curated-queries

    Hand-curated queries are drawn from Jekyll site.data.hand_curated_queries (and also include hand_curated_query_snippets). This data is accessed in the JavaScript through Liquid syntax. These queries appear as soon as the search bar is in focus.

    dynamic results

    Dynamic results are drawn from searching the Lunr.js index while the query is being formulated. These results appear as soon as one character has been typed in the search bar.
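
    The two triggers described above (focus for hand-curated queries, one typed character for dynamic results) can be sketched like this. It is not the code in autocomplete.js or suggestions.js: the sample data, `searchIndexSketch`, and `suggestionsSketch` are hypothetical, and a simple substring filter stands in for the Lunr.js index.

```javascript
// Hypothetical data; on the site the curated list comes from Jekyll site data
// and the dynamic results come from the Lunr.js index.
const curatedQueries = ['search seeds', 'seamful design'];
const searchIndexSketch = (query) =>
  ['Search seeds', 'Seamfulness notes', 'Sorting'].filter((title) =>
    title.toLowerCase().includes(query.toLowerCase())
  );

// Decide what the suggestions list shows for the current input value.
function suggestionsSketch(inputValue) {
  return {
    // Curated queries are available as soon as the search bar has focus.
    curated: curatedQueries,
    // Dynamic results need at least one typed character.
    dynamic: inputValue.length >= 1 ? searchIndexSketch(inputValue) : [],
  };
}

console.log(suggestionsSketch('').dynamic);   // → []
console.log(suggestionsSketch('so').dynamic); // → [ 'Sorting' ]
```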

    SERP (search engine results page)

    “Exactly # results”

    Notice this says “Exactly”. This is a commentary on the confusion introduced by how Google reports the number of search results. The count of results by Google in its SERP—as interpreted by many searchers—is inaccurate. See Randall Munroe for an explication on the XKCD blog: Trochee Chart. See also a longer explanation from Danny Sullivan from 2010—including Google’s suggested justification for no disclaimer—where he shares Matt Cutts saying “We’ve talked about the fact that results estimates are just estimates for years” (Note: there is no contextual help to the searcher on the Google SERP that this is the case, despite what Google’s People + AI Guidebook might say about Explainability + Trust).

    hand-curated snippets

    This label indicates snippets (for search results) that are manually written by the author. (These snippets are written directly into the page YAML.)

    generated snippets

    These snippets may be inaccurate. Generated snippets are marked with a “generated snippet” label, include a tooltip explainer, and link here.

    These snippets appear in the search results and in the hamburger list dropdown menu on the top left of each page.

    I am implementing these to (loosely) explore the processes to develop and maintain them. This label indicates snippets in the search results that are generated by feeding strings of items in the item_index.json to OpenAI’s ‘gpt-3.5-turbo’ model with a simple prompt (full code TK; the prompt was slightly modified for documents that were longer than the context window and so were chunked):

    # Task

    Write a search snippet that briefly summarizes the following document.

    # Content {text}
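
    Since the full code is not yet published, the following is only a sketch of how a prompt like the one above might be filled in and how long documents might be chunked. Everything here is an assumption: the `chunkText` and `buildPromptsSketch` names, the character-based chunk size (context windows are measured in tokens, not characters), and the reuse of the same template for every chunk. The call to the model itself is left out.

```javascript
// Prompt template as quoted above; `{text}` is the placeholder to fill.
const PROMPT_TEMPLATE = [
  '# Task',
  '',
  'Write a search snippet that briefly summarizes the following document.',
  '',
  '# Content {text}',
].join('\n');

// Split text into pieces of at most `maxChars` characters.
// (Character-based chunking is an assumption of this sketch.)
function chunkText(text, maxChars) {
  const chunks = [];
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars));
  }
  return chunks;
}

// Build one prompt per chunk; short documents yield a single prompt.
function buildPromptsSketch(text, maxChars = 8000) {
  // A replacer function avoids special handling of `$` in the document text.
  return chunkText(text, maxChars).map((chunk) =>
    PROMPT_TEMPLATE.replace('{text}', () => chunk)
  );
}

console.log(buildPromptsSketch('A page about search.').length); // → 1
console.log(buildPromptsSketch('x'.repeat(20), 8).length);      // → 3
```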

    Example of an inaccurate snippet, for search/Guide:

    This document is the official searching guide from Lunr.js and provides syntax for OR, AND, and NOT searches, as well as searching across specific fields. It also includes information on exact phrase searching, !bangs, wildcards, and boosts. generated snippet

    External results

    I’m experimenting with serving some external results; see [ type:external ]. If you own the result and do not wish to be included in my index, please tell me.

    Similarity

    Lunr.js does not directly support similarity searching. Similarity is determined here by running cosine_similarity (from sklearn) on embeddings of the pages across the website (in scripts/update_similars.py). The SERP will display the 30 “most similar” pages to the page queried. See more at similar: in search/Guide/.

    “Similar” pages are also listed in the hamburger list dropdown menu on the top left of each page (via js/similars.js). Currently the five “most similar” pages are listed.
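
    The ranking step can be pictured with the sketch below. The site computes similarity in Python with sklearn's cosine_similarity in scripts/update_similars.py; this is a plain JavaScript re-expression of the same measure, not that script. The toy embeddings, the URLs in them, and the `mostSimilarSketch` name are assumptions.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as having no similarity (a choice made for this sketch).
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank the other pages by similarity to `targetUrl` and keep the top `k`.
function mostSimilarSketch(targetUrl, embeddings, k) {
  const target = embeddings[targetUrl];
  return Object.keys(embeddings)
    .filter((url) => url !== targetUrl)
    .map((url) => ({ url, score: cosineSimilarity(target, embeddings[url]) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy three-dimensional embeddings (real embeddings have many more dimensions).
const toyEmbeddings = {
  '/search/about/': [1, 0, 1],
  '/search/guide/': [1, 0, 0.8],
  '/tweets/example/': [0, 1, 0],
};

console.log(mostSimilarSketch('/search/about/', toyEmbeddings, 1).map((r) => r.url));
// → [ '/search/guide/' ]
```

    With `k` set to 30 this would correspond to the SERP listing, and with `k` set to 5 to the dropdown menu listing.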

    References

    Eslami, M., Karahalios, K., Sandvig, C., Vaccaro, K., Rickman, A., Hamilton, K., & Kirlik, A. (2016). First I “like” it, then I hide it: Folk theories of social feeds. Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, 2371–2382. https://doi.org/10.1145/2858036.2858494 [eslami2016first]

    Griffin, D. (2022). Situating web searching in data engineering: Admissions, extensions, repairs, and ownership [PhD thesis, University of California, Berkeley]. https://danielsgriffin.com/assets/griffin2022situating.pdf [griffin2022situating]

    Mulligan, D. K., & Griffin, D. (2018). Rescripting search to respect the right to truth. The Georgetown Law Technology Review, 2(2), 557–584. https://georgetownlawtechreview.org/rescripting-search-to-respect-the-right-to-truth/GLTR-07-2018/ [mulligan2018rescripting]