利用Lisp教導電腦寫作

Home PDF

這篇文章最早以中文撰寫並在CSDN上發表,網址為https://blog.csdn.net/lzw_java/article/details/11599993


大部分的程式碼和概念均基於《Ansi Common Lisp》P138~P141。

問題:給定一段英文文本,計算機如何能夠根據它生成隨機但可讀的文本?例如:

國家創業投資協會估計,與交易相關的財富是由規範的大量開支決定的,這些規範將花費彼此之間的主要原因是這些項目。

這是由計算機根據學習一些保羅·格雷厄姆的文章後生成的隨機文字。它從單詞「Venture」延伸到一個句子。驚人的是,這些文字通常是可以讀懂的。

演算法:記錄每個單詞後出現的單詞以及它們出現的次數。例如,如果「I leave」在原始文本中出現5次,「I want」出現3次,且「I」在任何其他單詞之前都沒有出現,然後在生成隨機文本時,當遇到「I」時,選擇「leave」作為下一個單詞的概率為5/8。如果選擇了「leave」,則檢查在「leave」後出現的單詞並重複該過程。

現在,讓我們用Lisp來解決這個問題。

Lisp的符號型別可以很好地記錄各種字串和標點符號,因此我們將使用它來記錄。我們將使用內建的hashtable來創建一個清單:

(defparameter *words* (make-hash-table :size 10000))

我們如何創建這個清單?

(let ((prev '|.|))
  (defun see (sym)
    (let ((pair (assoc sym (gethash prev *words*))))
      (if pair
	  (incf (cdr pair))
	  (push (cons sym 1) (gethash prev *words*))))
    (setf prev sym)))
當前的單詞是關鍵字,assoc清單是該關鍵字下的值。例如,在「I」下我們有(( leave . 5) ( want . 3))。如果單詞不存在,則push(單詞 . 1)。

我們如何隨機選擇一個單詞?

(defun random-word (word ht)
  (let* ((choices (gethash word ht))
	 (x (random (reduce #'+ choices :key #'cdr))))
    (dolist (pair choices)
      (decf x (cdr pair))
      (if (minusp x)
	  (return (car pair))))))

這裡,reduce函數被巧妙地使用。

現在,讓我們思考如何將給定的單詞在兩邊擴展成一個句子?

1) 反轉文本以獲取反轉的清單,即「I leave, I want」變為「leave I, want I」。

2) 反轉hashtable以獲取另一個hashtable,其中後面的單詞是鍵,可能的前面單詞及其計數形成一個assoc清單。

3) 試試你的運氣,從標點符號開始擴展句子,直到出現給定的單詞。

我使用了第二種方法:

(defparameter *r-words* (make-hash-table :size 10000))

(defun push-words (w1 w2 n)
  (push (cons w2 n) (gethash w1 *r-words*)))

(defun get-reversed-words () ; a cat -> cat a
  (maphash #'(lambda (k lst)
	       (dolist (pair lst)
		 (push-words (car pair) k (cdr pair))))
	   *words*))

遍歷原始hashtable,然後將每對單詞插入到另一個hashtable中,反轉它們的順序。以下是自動生成雙向擴展句子的程式碼:

(defparameter *words* (make-hash-table :size 10000))
(defconstant maxword 100)
(defparameter nwords 0)
(defconstant debug nil)
(let ((prev '|.|))
  (defun see (sym)
    (incf nwords)
    (let ((pair (assoc sym (gethash prev *words*))))
      (if pair
	  (incf (cdr pair))
	  (push (cons sym 1) (gethash prev *words*))))
    (setf prev sym)))

(defun check-punc (c) ; char to symbol
  (case c
    (#\. '|.|) (#\, '|,|)
    (#\; '|;|) (#\? '|?|)
    (#\: '|:|) (#\! '|!|)))

(defun read-text (pathname)
  (with-open-file (str pathname :direction :input)
    (let ((buf (make-string maxword))
	  (pos 0))
      (do ((c (read-char str nil 'eof)
	      (read-char str nil 'eof)))
	  ((eql c 'eof))
	(if (or (alpha-char-p c)
		(eql c #\'))
	    (progn
	      (setf (char buf pos) c)
	      (incf pos))
	    (progn
	      (unless (zerop pos)
		(see (intern (subseq buf 0 pos)))
		(setf pos 0))
	      (let ((punc (check-punc c)))
		(if punc
		    (see punc)))))))))

(defun print-ht (ht)
  (maphash #'(lambda (k v)
		(format t "~A ~A~%" k v))
	     ht))

(defparameter *r-words* (make-hash-table :size 10000))

(defun push-words (w1 w2 n)
  (push (cons w2 n) (gethash w1 *r-words*)))

(defun get-reversed-words () ; a cat -> cat a
  (maphash #'(lambda (k lst)
	       (dolist (pair lst)
		 (push-words (car pair) k (cdr pair))))
	   *words*))

(defun print-a-word (word ht)
  (maphash #'(lambda (k lst)
	       (if (eql k word)
		   (format t "~A ~A~%" k lst)))
	   ht))

(if debug
    (print-a-word '|leave| *r-words*))

(defun punc-p (sym) ; symbol to char, nil when fails.
  (check-punc (char (symbol-name sym) 0)))

(defun random-word (word ht)
  (let* ((choices (gethash word ht))
	 (x (random (reduce #'+ choices :key #'cdr))))
    (dolist (pair choices)
      (decf x (cdr pair))
      (if (minusp x)
	  (return (car pair))))))

(defun gen-former (word str)
  (let ((last (random-word word *r-words*)))
    (if (not (punc-p last))
	(progn
	  (gen-former last str)
	  (format str "~A " last)))))

(defun gen-latter (word str)
  (let ((next (random-word word *words*)))
    (format str "~A " next)
    (if (not (punc-p next))
        (gen-latter next str))))

;(gen-latter '|leave| t)

(defun get-a-word (ht) ; get a random word
  (let ((x (random nwords)))
    (maphash #'(lambda (k v)
		 (dolist (pair v)
		   (decf x (cdr pair))
		   (if (minusp x)
		       (return-from get-a-word (car pair)))))
	     ht)))
;(get-a-word *words*)
(defun gen-sentence (word str)
  (gen-former word str)
  (format str "~A " word)
  (gen-latter word str))

(defun test ()
  (setf nwords 0)
  (read-text "essay.txt")
  (get-reversed-words)
  (let ((word (get-a-word *words*)))
    (print word)
    (gen-sentence word t)))
(test)

Back 2025.02.22 Donate