From: Lars Ingebrigtsen <larsi@gnus.org>
Date: Thu, 20 Jan 2022 07:38:16 +0000 (+0100)
Subject: Document textsec
X-Git-Tag: emacs-29.0.90~2917
X-Git-Url: http://git.eshelyaron.com/gitweb/?a=commitdiff_plain;h=2a3edd1e0acf00587a5243db87bf80e8383a61d8;p=emacs.git

Document textsec

* doc/lispref/elisp.texi (Top): Add menu.
* doc/lispref/text.texi (Text): Add menu.
(Suspicious Text): New node.

* lisp/international/textsec-check.el (textsec-check): Rename
argument to OBJECT and adjust doc string.
(textsec-propertize): Remove.
---
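Note for package authors: the new manual node asks applications to
mark suspicious text with the `textsec-suspicious' face and to make
the explanation from `textsec-check' available to the user.  The
following is a minimal sketch of that pattern, modeled on the
`textsec-propertize' function that this commit removes.  The helper
name `my-mark-if-suspicious' is hypothetical and not part of Emacs,
and using `help-echo' for the explanation is only one possible choice.

```elisp
;; Feature provided by lisp/international/textsec-check.el.
(require 'textsec-check)

;; Hypothetical helper, not part of Emacs: it mirrors the removed
;; `textsec-propertize' to illustrate the documented usage pattern.
(defun my-mark-if-suspicious (string type)
  "Return STRING, highlighted if it looks suspicious as TYPE.
TYPE is one of the symbols accepted by `textsec-check'."
  (let ((warning (textsec-check string type)))
    (if (not warning)
        ;; Not suspicious (or checks disabled): return STRING unchanged.
        string
      ;; Suspicious: add the face and expose the explanation as a tooltip.
      (propertize string
                  'face 'textsec-suspicious
                  'help-echo warning))))

;; Usage: the result depends on the checks in the running Emacs.
(my-mark-if-suspicious "www.gnu.org" 'domain)
```

For the `link' type, OBJECT is a cons cell rather than a string, e.g.
(textsec-check (cons "http://gnu.org" "fsf.org") 'link), so a helper
like the one above would need to take the text to propertize separately.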

diff --git a/doc/lispref/elisp.texi b/doc/lispref/elisp.texi
index 3254a4dba81..1f339ef799f 100644
--- a/doc/lispref/elisp.texi
+++ b/doc/lispref/elisp.texi
@@ -1228,6 +1228,7 @@ Text
 * Decompression::           Dealing with compressed data.
 * Base 64::                 Conversion to or from base 64 encoding.
 * Checksum/Hash::           Computing cryptographic hashes.
+* Suspicious Text::         Determining whether a string is suspicious.
 * GnuTLS Cryptography::     Cryptographic algorithms imported from GnuTLS.
 * Database::                Interacting with an SQL database.
 * Parsing HTML/XML::        Parsing HTML and XML.
diff --git a/doc/lispref/text.texi b/doc/lispref/text.texi
index b9df66dbdb4..e94b1112d70 100644
--- a/doc/lispref/text.texi
+++ b/doc/lispref/text.texi
@@ -59,6 +59,7 @@ the character after point.
 * Decompression::    Dealing with compressed data.
 * Base 64::          Conversion to or from base 64 encoding.
 * Checksum/Hash::    Computing cryptographic hashes.
+* Suspicious Text::  Determining whether a string is suspicious.
 * GnuTLS Cryptography:: Cryptographic algorithms imported from GnuTLS.
 * Database::         Interacting with an SQL database.
 * Parsing HTML/XML:: Parsing HTML and XML.
@@ -4943,6 +4944,80 @@ It should be somewhat more efficient on larger buffers than
 @c according to what we find useful.
 @end defun
 
+@node Suspicious Text
+@section Suspicious Text
+
+Emacs can display data from many external sources, like mail and web
+pages.  Attackers may attempt to confuse the user reading this data by
+using obfuscated @acronym{URL}s or email addresses, and tricking the
+user into visiting a web page they didn't intend to visit, or sending
+an email to the wrong address.
+
+This usually involves using characters from scripts that visually look
+like @acronym{ASCII} characters (i.e., are homoglyphs), but other
+techniques are also used, like bidirectional overrides, or having an
+@acronym{HTML} link text that says one thing while the underlying
+@acronym{URL} points somewhere else.
+
+To help identify these @dfn{suspicious strings}, Emacs provides a
+library to do a number of checks.  (See
+@url{https://www.unicode.org/reports/tr39/} for the rationale behind
+the checks that are available.)  Packages that present data that might
+be suspicious should use this library.
+
+@vindex textsec-check
+@defun textsec-check object type
+This function is the high-level interface function that packages
+should use.  It respects the @code{textsec-check} user option, which
+allows the user to disable the checks.
+
+This function checks @var{object} to see if it looks suspicious when
+interpreted as a thing of @var{type}.  The available types are:
+
+@table @code
+@item domain
+Check whether a domain (e.g., @samp{www.gnu.org}) looks suspicious.
+
+@item url
+Check whether an @acronym{URL} (e.g., @samp{http://gnu.org/foo/bar})
+looks suspicious.
+
+@item link
+Check whether an @acronym{HTML} link (e.g., @samp{<a
+href='http://gnu.org'>fsf.org</a>}) looks suspicious.  In this case,
+@var{object} should be a @code{cons} cell where the @code{car} is the
+@acronym{URL} and the @code{cdr} is the link text.  The link is deemed
+suspicious if the link text contains a domain name, and that domain
+name points to something other than the @acronym{URL}.
+
+@item email-address
+Check whether an email address (e.g., @samp{foo@@example.org}) looks
+suspicious.
+
+@item local-address
+Check whether the local part of an email address (the bit before the
+@samp{@@} sign) looks suspicious.
+
+@item name
+Check whether a name (used in an email address header) looks suspicious.
+
+@item email-address-header
+Check whether a full RFC 2822 email address header (e.g.,
+@samp{=?utf-8?Q?=C3=81?= <foo@@example.com>}) looks suspicious.
+@end table
+
+If @var{object} is suspicious, this function will return a string that
+explains why it is suspicious.  If @var{object} is not suspicious, it
+returns @code{nil}.
+@end defun
+
+If the text is suspicious, the application should mark the suspicious
+text with the @code{textsec-suspicious} face, and make the explanation
+returned by @code{textsec-check} available to the user.  The
+application might also prompt the user before taking any action on a
+suspicious string (like sending an email to a suspicious email
+address).
+
 @node GnuTLS Cryptography
 @section GnuTLS Cryptography
 @cindex MD5 checksum
diff --git a/etc/NEWS b/etc/NEWS
index ac3b1dccf9c..d3abe349f2a 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -960,6 +960,7 @@ The input must be encoded text.
 
 * Lisp Changes in Emacs 29.1
 
+--
 ** New function 'bidi-string-strip-control-characters'.
 This utility function is meant for displaying strings when it's
 essential that there's no bidirectional context.
@@ -1007,6 +1008,32 @@ This event is sent when a user peforms a pinch gesture on a touchpad,
 which is comprised of placing two fingers on the touchpad and moving
 them towards or away from each other.
 
+** Text security and suspiciousness
+
++++
+*** New library textsec.el.
+This library contains a number of checks for whether a string is
+"suspicious".  This usually means that the string contains characters
+that have glyphs that can be confused with other, more commonly used
+glyphs, or contains bidirectional (or other) formatting characters that
+may be used to confuse a user.
+
++++
+*** New user option 'textsec-check'.
+If non-nil (which is the default), Emacs packages that are vulnerable
+to attackers trying to confuse the users will use the textsec library
+to mark suspicious text.  For instance, shr/eww will mark suspicious
+URLs and links, Gnus will mark suspicious From addresses, and
+Message will query the user if the user is sending mail to a
+suspicious address.  If this variable is nil, these checks aren't
+performed.
+
++++
+*** New function 'textsec-check'.
+This is the main function Emacs applications should be using to check
+whether a string is suspicious.  It heeds the 'textsec-check' user
+option.
+
 ** Keymaps and key definitions
 
 +++
diff --git a/lisp/international/textsec-check.el b/lisp/international/textsec-check.el
index 8f641e5a66d..f61cc82b5b2 100644
--- a/lisp/international/textsec-check.el
+++ b/lisp/international/textsec-check.el
@@ -39,13 +39,13 @@ If nil, these checks are disabled."
   "Face used to highlight suspicious strings.")
 
 ;;;###autoload
-(defun textsec-check (string type)
-  "Test whether STRING is suspicious when considered as TYPE.
-If STRING is suspicious, a string explaining the possible problem
+(defun textsec-check (object type)
+  "Test whether OBJECT is suspicious when considered as TYPE.
+If OBJECT is suspicious, a string explaining the possible problem
 is returned.
 
 Available types include `url', `link', `domain', `local-address',
-`name', `email-address', and `email-address-headers'.
+`name', `email-address', and `email-address-header'.
 
 If the `textsec-check' user option is nil, these checks are
 disabled, and this function always returns nil."
@@ -55,23 +55,7 @@ disabled, and this function always returns nil."
     (let ((func (intern (format "textsec-%s-suspicious-p" type))))
       (unless (fboundp func)
         (error "%s is not a valid function" func))
-      (funcall func string))))
-
-;;;###autoload
-(defun textsec-propertize (string type)
-  "Test whether STRING is suspicious when considered as TYPE.
-If STRING is suspicious, text properties will be added to the
-string to mark it as suspicious, and with tooltip texts that says
-what's suspicious about it.  Otherwise STRING is returned
-verbatim.
-
-See `texsec-check' for further information about TYPE."
-  (let ((warning (textsec-check string type)))
-    (if (not warning)
-        string
-      (propertize string
-                  'face 'textsec-suspicious
-                  'help-echo warning))))
+      (funcall func object))))
 
 (provide 'textsec-check)