ElfOfPi

I read a Reddit post saying that using cl-lib was kind of a bad thing, and I think I've always been afraid that using libraries in my config would bloat it or slow Emacs down. But after all the comments here, I think I'll change my stance on that.

 

Hi Emacs community,

I'm an elisp noob, and I recently wrote a function to get the references on a Wikipedia page. I plan on using it with org-mode/org-roam so I can do research faster (even though there's probably already a package for that sort of thing). Unfortunately, it's probably not as robust as I'd like to think: some of the DOIs/ISBNs appear to be missing on some of the Wikipedia pages I've tested. Here it is for reference:

(require 'dom)

(defun get-wikipedia-references (subject)
  "Return an alist of (ID . TITLE) references for Wikipedia article SUBJECT."
  (let ((wikipedia-prefix-url "https://en.wikipedia.org/wiki/"))
    (with-current-buffer
        (url-retrieve-synchronously (concat wikipedia-prefix-url subject))
      ;; Skip the HTTP headers: the body starts after the first empty line.
      (goto-char (point-min))
      (re-search-forward "^$")
      (let ((dom (libxml-parse-html-region (1+ (point)) (point-max)))
            (result nil))
        (dolist (cite-tag (dom-by-tag dom 'cite) result)
          ;; A missing class attribute is treated as the empty string.
          (let ((cite-class (or (dom-attr cite-tag 'class) "")))
            (cond
             ;; Journal citation: DOI link text plus the quoted title.
             ((string-search "journal" cite-class)
              (let* ((a-tag (dom-search
                             cite-tag
                             (lambda (tag)
                               (string-prefix-p "https://doi.org"
                                                (dom-attr tag 'href)))))
                     (cite-texts (dom-texts cite-tag))
                     (quote-beg (string-search "\"" cite-texts))
                     (quote-end (and quote-beg
                                     (string-search "\"" cite-texts
                                                    (1+ quote-beg)))))
                (push (cons (concat "doi:" (dom-text a-tag))
                            ;; nil when the citation has no quoted title.
                            (and quote-end
                                 (substring cite-texts (1+ quote-beg) quote-end)))
                      result)))
             ;; Book citation: ISBN from the BookSources link, italic title.
             ((string-search "book" cite-class)
              (let ((a-tag (dom-search
                            cite-tag
                            (lambda (tag)
                              (string-prefix-p "/wiki/Special:BookSources"
                                               (dom-attr tag 'href))))))
                (push (cons (concat "isbn:"
                                    (dom-text (dom-child-by-tag a-tag 'bdi)))
                            (dom-text (dom-child-by-tag cite-tag 'i)))
                      result)))
             ;; Anything else: the first direct link and its text.
             (t
              (let ((a-tag (dom-child-by-tag cite-tag 'a)))
                (push (cons (dom-attr a-tag 'href) (dom-text a-tag))
                      result))))))))))

(get-wikipedia-references "Graph_traversal")
(("doi:10.1109/SFCS.1979.34" . "Random walks, universal traversal sequences, and the complexity of maze problems")
 ("doi:10.1016/j.tcs.2015.11.017" . "Lower and upper competitive bounds for online directed graph exploration")
 ("doi:10.1016/j.tcs.2020.06.007" . "Online graph exploration on a restricted graph class: Optimal solutions for tadpole graphs")
 ("doi:10.1587/transinf.E92.D.1620" . "The Online Graph Exploration Problem on Restricted Graphs")
 ("doi:10.1016/j.tcs.2021.04.003" . "An improved lower bound for competitive graph exploration")
 ("doi:10.1137/0206041" . "An Analysis of Several Heuristics for the Traveling Salesman Problem"))
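
The title extraction in the journal branch is one step that can be tried out without any network access. Below is a minimal sketch of that step as a standalone helper, assuming Emacs 28 or later for string-search. The name my/first-quoted-substring is hypothetical and not part of the function above, and the sample strings are made up for illustration.

```elisp
;; Hypothetical helper (the name is an assumption, not from the post):
;; return the text between the first pair of double quotes in TEXT,
;; or nil when TEXT does not contain two double quotes.
(defun my/first-quoted-substring (text)
  "Return the text between the first two double quotes in TEXT, or nil."
  (let* ((beg (string-search "\"" text))
         ;; Only look for the closing quote if an opening one was found.
         (end (and beg (string-search "\"" text (1+ beg)))))
    (when end
      (substring text (1+ beg) end))))

;; Made-up sample inputs:
(my/first-quoted-substring "Doe, J. (1979). \"Random walks\". Some Journal.")
;; => "Random walks"
(my/first-quoted-substring "A citation without a quoted title")
;; => nil
```

Returning nil instead of signalling an error lets the caller decide what to do with citations that have no quoted title.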

And yes, I know that I could probably use a library like s, dash, seq, or cl, but I try to keep my elisp functions free of those kinds of things. I would appreciate any criticism of my elisp from the Emacs community!