<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Morphology &#8211; Megaputer Intelligence</title>
	<atom:link href="https://www.megaputer.com/tag/morphology/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.megaputer.com</link>
	<description>Your Knowledge Partner</description>
	<lastBuildDate>Tue, 24 Mar 2026 00:02:52 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=5.0.22</generator>

<image>
	<url>https://www.megaputer.com/wp-content/uploads/favicon.png</url>
	<title>Morphology &#8211; Megaputer Intelligence</title>
	<link>https://www.megaputer.com</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Query languages—the Swiss army knife of information extraction</title>
		<link>https://www.megaputer.com/query-languages-the-swiss-army-knife-of-information-extraction/</link>
		<pubDate>Tue, 06 Feb 2024 05:19:41 +0000</pubDate>
		<dc:creator><![CDATA[Echo Lu]]></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[Computational Linguistics]]></category>
		<category><![CDATA[Data Analytics]]></category>
		<category><![CDATA[Data Mining]]></category>
		<category><![CDATA[Entity Extraction]]></category>
		<category><![CDATA[Fuzzy Matching]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Morphology]]></category>
		<category><![CDATA[Pattern Definition Language]]></category>
		<category><![CDATA[Text Analytics]]></category>

		<guid isPermaLink="false">https://www.megaputer.com/?p=35231</guid>
		<description><![CDATA[<p>Text mining, the art of extracting information from text, requires the formulation of efficient queries that retrieve information based on user input. To do this, the user requires a language for writing queries. For the most basic use cases, the language operators could be regex or string search. But while regex and string search are...</p>
<p>The post <a rel="nofollow" href="https://www.megaputer.com/query-languages-the-swiss-army-knife-of-information-extraction/">Query languages—the Swiss army knife of information extraction</a> appeared first on <a rel="nofollow" href="https://www.megaputer.com">Megaputer Intelligence</a>.</p>
]]></description>
				<content:encoded><![CDATA[<p><span style="font-weight: 400;">Text mining, the art of extracting information from text, requires the formulation of efficient queries that retrieve information based on user input. To do this, the user requires a language for writing queries. For the most basic use cases, the language operators could be regex or string search. But while regex and string search are indispensable for text mining, their utility hits a hard ceiling when semantic or meaningful search is required. They cannot, for example, capture complex entities such as human names, corporations, and drugs. For handling tasks like these, we need a more powerful query language that has semantic understanding, such as Megaputer’s PDL.</span></p>
<p><span style="font-weight: 400;">So, what is PDL, and how does it achieve semantic understanding while regex and string search do not? Let’s take a look at an example to find out.</span></p>
<p><span style="font-weight: 400;">First of all, PDL does not just search for the literal form of the word in the query: instead, it automatically extends its search to all morphological forms of the word. For example, when searching for the word “company” in financial news articles, the PDL query will not only find “company”, but also “companies,” the plural form. This feature often comes in handy, especially when the search involves a verb. Suppose that you are interested in extracting </span><i><span style="font-weight: 400;">what the CEOs said. </span></i><span style="font-weight: 400;">With regex or other substring search, you will need to list all possible verb forms such as “say,” “saying,” “says,” and “said.” With PDL, simply entering “say” in the query will automatically fetch all possible verb forms. This behavior can also be turned off by enclosing the word in the </span><i><span style="font-weight: 400;">form </span></i><span style="font-weight: 400;">function</span><span style="font-weight: 400;">,</span><span style="font-weight: 400;"> which will then restrict the search to the literal form of the word, such as in the example below.</span></p>
<p><img class="wp-image-35249 aligncenter" src="https://www.megaputer.com/wp-content/uploads/comparison_pdl-1.png" alt="" width="800" height="497" /></p>
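<p><span style="font-weight: 400;">The contrast between matching all morphological forms and matching only the literal form can be sketched in a few lines of Python. This is not PDL and not how PolyAnalyst is implemented: the tiny <i>FORMS</i> lexicon and the helper names <i>find_all_forms</i> and <i>find_literal</i> are hypothetical, chosen only to illustrate the idea.</span></p>
<pre>
```python
import re

# Hypothetical mini-lexicon mapping a base word to its inflected forms.
# A real morphological dictionary would be far larger.
FORMS = {
    "say": {"say", "says", "saying", "said"},
    "company": {"company", "companies"},
}


def tokenize(text):
    """Split text into alphabetic tokens."""
    return re.findall(r"[A-Za-z]+", text)


def find_all_forms(word, text):
    """Return every token that is a listed form of word (the default behavior described above)."""
    forms = FORMS.get(word, {word})
    return [t for t in tokenize(text) if t.lower() in forms]


def find_literal(word, text):
    """Return only tokens identical to word, ignoring case (the literal-form behavior)."""
    return [t for t in tokenize(text) if t.lower() == word]


text = "The CEO said profits rose. Analysts say both companies are saying the same."
print(find_all_forms("say", text))      # ['said', 'say', 'saying']
print(find_literal("say", text))        # ['say']
print(find_all_forms("company", text))  # ['companies']
```
</pre>
<p><span style="font-weight: 400;">With plain regex, the list of forms would have to be written into every pattern by hand; keeping the forms in a dictionary moves that work out of the query.</span></p>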
<p><span style="font-weight: 400;">Another notable feature of PDL is that users can tailor the scope of their searches with a range of built-in functions. Returning to the previous example, you may not want to confine your search to the verb “say” alone, but rather include synonymous verbs such as “tell” or “mention.” This is straightforward in PDL: users can invoke the </span><i><span style="font-weight: 400;">synonym</span></i> <span style="font-weight: 400;">function with the verb &#8220;say,&#8221; as demonstrated in (a) below. As the subsequent results table (b) illustrates, the captured text now includes various speech verbs such as “tell,” “emphasize,” and “claim” in addition to “say,” each in all of its verb forms. For additional flexibility, users can also create and modify synonym dictionaries.</span></p>
<p><img class="wp-image-35250 aligncenter" src="https://www.megaputer.com/wp-content/uploads/comparison_pdl-2.png" alt="" width="800" height="497" /></p>
<p><span style="font-weight: 400;">PDL offers various modes of information extraction, including proximity search (e.g., finding words A and B within a sentence, or within a three-word range), syntactic relations (e.g., finding word A that is the subject or object of B), semantic relations (e.g., finding words that are synonyms or antonyms of word A), access to dictionaries and ontologies, and more. The language is expressive enough to capture complex patterns, yet relatively easy to use, </span><span style="font-weight: 400;">having a syntax that closely resembles English. Having access to this versatile query language significantly enhances the power and quality of text mining operations.</span></p>
<p><span style="font-weight: 400;">In conclusion, PDL is a powerful and versatile query language that enables users to extract meaningful information from text with greater efficiency and accuracy than competing methods like regex or string search. Its ability to understand and capture morphological forms, synonyms, and other complex patterns makes it an indispensable tool for solving text mining tasks that require semantic understanding. By leveraging the capabilities of PDL, users can enhance their information extraction processes and gain valuable insights from their data, making it a true Swiss army knife of information extraction.</span></p>
<p>The post <a rel="nofollow" href="https://www.megaputer.com/query-languages-the-swiss-army-knife-of-information-extraction/">Query languages—the Swiss army knife of information extraction</a> appeared first on <a rel="nofollow" href="https://www.megaputer.com">Megaputer Intelligence</a>.</p>
]]></content:encoded>
			</item>
		<item>
		<title>Morphology: Investigating Word Structures</title>
		<link>https://www.megaputer.com/morphology-investigating-word-structures/</link>
		<pubDate>Tue, 05 Sep 2023 20:38:43 +0000</pubDate>
		<dc:creator><![CDATA[Echo Lu]]></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Computational Linguistics]]></category>
		<category><![CDATA[Morphology]]></category>
		<category><![CDATA[Text Analytics]]></category>

		<guid isPermaLink="false">https://www.megaputer.com/?p=34622</guid>
		<description><![CDATA[<p>Morphology is a subfield of linguistics that studies the internal structure of words. While words are often perceived as the fundamental units in textual analysis, they can be broken down further into smaller constituents known as morphemes, such as roots, prefixes, and suffixes. This post introduces these concepts and shows how the PDL query language in PolyAnalyst applies them through functions such as lemma, form, and singleroot.</p>
<p>The post <a rel="nofollow" href="https://www.megaputer.com/morphology-investigating-word-structures/">Morphology: Investigating Word Structures</a> appeared first on <a rel="nofollow" href="https://www.megaputer.com">Megaputer Intelligence</a>.</p>
]]></description>
				<content:encoded><![CDATA[<section class="l-section wpb_row height_small"><div class="l-section-h i-cf"><div class="g-cols vc_row type_default valign_top"><div class="vc_col-sm-12 wpb_column vc_column_container"><div class="vc_column-inner"><div class="wpb_wrapper">
	<div class="wpb_text_column ">
		<div class="wpb_wrapper">
			<p><span style="font-weight: 400;">Morphology stands as a vital subfield of linguistics, delving into the organization of words. While words are often perceived as the fundamental units in textual analysis, they can be deconstructed further into smaller constituents known as morphemes. These morphemes, which are the elemental units of grammatical form in language, fall into distinct classifications including roots, suffixes, prefixes, and more. Through the application of word formation rules, these morphemes combine to create coherent words. In essence, just as sentences have internal structures, words also exhibit intricate internal structures consisting of morphemes, and those morphemes are combined through morphological rules, similar to how words are composed through syntactic rules.</span></p>

		</div>
	</div>

	<div class="wpb_text_column ">
		<div class="wpb_wrapper">
			<p><span style="font-weight: 400;">There are two important categories of morphemes: those that can stand alone (e.g., “car,” </span><span style="font-weight: 400;">“free,” “sing”), and those that must be attached to other morphemes (e.g., “-ish,” “bi-,” “-er,” “-est”). The former are termed &#8220;free morphemes,&#8221; while the latter are referred to as &#8220;bound morphemes.&#8221; Bound morphemes can attach at various positions around another morpheme. A bound morpheme attached to the beginning of another morpheme (e.g., “bi-” in “bi-” + “weekly”) is called a prefix; one attached to the end of another morpheme (e.g., “-er” in “teach” + “-er”) is called a suffix. Affixes that are inserted within a morpheme (“infixes”) or that surround a morpheme (“circumfixes”) also exist, but they are rare or non-existent in English.</span></p>

		</div>
	</div>

	<div class="wpb_text_column ">
		<div class="wpb_wrapper">
			<p><span style="font-weight: 400;">Morphologically complex words are decomposed into various parts such as roots and affixes. A root is the core lexical unit of a word to which affixes are attached. While a root is often a free morpheme (e.g., “teach” in “teacher”), some words have a bound morpheme as their root. For example, a word like “receive” is decomposed into “re-,” a prefix, and “ceive,” a root that cannot stand alone. Another important aspect of multimorphemic words is that they can combine with further affixes to derive more complex words. Consider the word &#8220;global&#8221; as an illustration: it consists of the root &#8220;globe&#8221; and the suffix &#8220;-al.&#8221; It can combine with additional affixes to form words like &#8220;globalize,&#8221; &#8220;globalization,&#8221; and more. These instances exemplify what is known as derivational morphology, in which added morphemes derive words with new meanings. Importantly, however, the addition of a morpheme does not always give rise to a new meaning. Inflectional morphemes, which primarily serve grammatical functions, do not change a word’s meaning but only alter its grammatical attributes. For instance, adding the &#8220;-ed&#8221; morpheme to a verb (e.g., “walk” + “-ed” -> “walked”) only modifies its grammatical form, turning the verb into its past tense form, while its meaning remains unaffected.</span></p>

		</div>
	</div>
<div class="w-image align_center"><div class="w-image-h"><img width="512" height="253" src="https://www.megaputer.com/wp-content/uploads/morphology-investigating-word-structures.jpeg" class="attachment-large size-large" alt="This jpeg image is about examples of word structures in morphology" srcset="https://www.megaputer.com/wp-content/uploads/morphology-investigating-word-structures.jpeg 512w, https://www.megaputer.com/wp-content/uploads/morphology-investigating-word-structures-300x148.jpeg 300w" sizes="(max-width: 512px) 100vw, 512px" /></div></div><div class="w-separator size_small"></div>
	<div class="wpb_text_column ">
		<div class="wpb_wrapper">
			<p><span style="font-weight: 400;">In PolyAnalyst, a built-in algorithm equipped with a morphological dictionary automatically tags every token in the text with morpheme-level information such as grammatical categories or tenses. For example, verbs that end with the “ed” morpheme are automatically tagged as the past tense or the past participial form of the verb. Furthermore, PolyAnalyst offers a variety of morphological functions in the PDL query language, allowing end users to craft precise queries tailored to their specific needs. The video clip below illustrates some of these functions. As shown here, users can employ the</span> <span style="font-weight: 400;">lemma</span><span style="font-weight: 400;"> function to retrieve all word forms of a given lemma. For example, </span><span style="font-weight: 400;">lemma(face)</span><span style="font-weight: 400;"> finds all inflectional forms such as “faces,” “faced,” “facing,” and so on. Since the word “face&#8221; can function as both a noun and a verb, a user can further refine the search to locate only the forms corresponding to the noun sense; e.g., </span><span style="font-weight: 400;">lemma(noun, face)</span><span style="font-weight: 400;">. </span></p>

		</div>
	</div>
<div class="w-separator size_small"></div><div class="w-video ratio_16x9"><div class="w-video-h"><iframe src="//www.youtube.com/embed/iKG50dB8Pa8?rel=0" allowfullscreen="1"></iframe></div></div>
	<div class="wpb_text_column ">
		<div class="wpb_wrapper">
			<p style="text-align: center;"><em><span style="font-weight: 400;">lemma(face) and lemma(noun, face)</span></em></p>

		</div>
	</div>
<div class="w-separator size_small"></div>
	<div class="wpb_text_column ">
		<div class="wpb_wrapper">
			<p><span style="font-weight: 400;">Notably, PolyAnalyst defaults to a comprehensive search of all word forms of a given word. That is, if the word &#8220;facing&#8221; is included in a query, PolyAnalyst will retrieve all inflectional forms of its root “face,” capturing &#8220;face,&#8221; &#8220;faced,&#8221; “faces,” and &#8220;facing.&#8221; If a user wishes to narrow down the search to a specific word form, they can use the</span><span style="font-weight: 400;"> form</span> <span style="font-weight: 400;">function, which matches only the exact word form given as its argument; for instance, </span><span style="font-weight: 400;">form(facing)</span><span style="font-weight: 400;"> matches instances of &#8220;facing&#8221; and nothing else.</span></p>
<p><span style="font-weight: 400;">Among this series of functions, the one that covers the broadest scope is </span><span style="font-weight: 400;">singleroot</span><span style="font-weight: 400;">. This function searches not only for all possible inflectional forms of a word but also for all words that share the same root. For instance, </span><span style="font-weight: 400;">singleroot(face) </span><span style="font-weight: 400;">would match a word like &#8220;facial,&#8221; along with &#8220;face,&#8221; &#8220;faced,&#8221; &#8220;facing,&#8221; and so on.</span></p>
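<p><span style="font-weight: 400;">The relative scope of these three functions can be illustrated with a toy model in Python. This is only a sketch under stated assumptions: the <i>LEXICON</i> entries are hand-written for illustration, the Python functions merely borrow the PDL names, and their argument order (word first, optional part of speech second) is a Python convenience rather than PDL syntax.</span></p>
<pre>
```python
from typing import NamedTuple


class Entry(NamedTuple):
    form: str   # surface form as it appears in text
    lemma: str  # dictionary form of the word
    pos: str    # part of speech
    root: str   # root shared by derivationally related words


# Hypothetical toy lexicon; a real morphological dictionary is far larger.
LEXICON = [
    Entry("face", "face", "noun", "face"),
    Entry("faces", "face", "noun", "face"),
    Entry("face", "face", "verb", "face"),
    Entry("faces", "face", "verb", "face"),
    Entry("faced", "face", "verb", "face"),
    Entry("facing", "face", "verb", "face"),
    Entry("facial", "facial", "adjective", "face"),
]


def form(word):
    """Narrowest scope: only the exact word form."""
    return {word}


def lemma(word, pos=None):
    """All inflectional forms of a lemma, optionally limited to one part of speech."""
    return {e.form for e in LEXICON
            if e.lemma == word and (pos is None or e.pos == pos)}


def singleroot(word):
    """Broadest scope: every form that shares a root with the given word."""
    roots = {e.root for e in LEXICON if e.form == word}
    return {e.form for e in LEXICON if e.root in roots}


print(sorted(form("facing")))          # ['facing']
print(sorted(lemma("face")))           # ['face', 'faced', 'faces', 'facing']
print(sorted(lemma("face", "noun")))   # ['face', 'faces']
print(sorted(singleroot("facing")))    # ['face', 'faced', 'faces', 'facial', 'facing']
```
</pre>
<p><span style="font-weight: 400;">In this model each result set contains the previous one, which mirrors the narrowing and widening of scope described above.</span></p>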

		</div>
	</div>
<div class="w-separator size_medium"></div><div class="w-video ratio_16x9"><div class="w-video-h"><iframe src="//www.youtube.com/embed/TPQ-dIPLGIQ?rel=0" allowfullscreen="1"></iframe></div></div>
	<div class="wpb_text_column ">
		<div class="wpb_wrapper">
			<p style="text-align: center;"><em><span style="font-weight: 400;">form(facing) and singleroot(facing)</span></em></p>

		</div>
	</div>
<div class="w-separator size_small"></div></div></div></div></div></div></section>
<p>The post <a rel="nofollow" href="https://www.megaputer.com/morphology-investigating-word-structures/">Morphology: Investigating Word Structures</a> appeared first on <a rel="nofollow" href="https://www.megaputer.com">Megaputer Intelligence</a>.</p>
]]></content:encoded>
			</item>
		<item>
		<title>PolyAnalyst introduces support for using stop list dictionaries when analyzing spelling errors</title>
		<link>https://www.megaputer.com/stop-list-dictionary-spell-check/</link>
		<pubDate>Tue, 15 May 2018 18:44:31 +0000</pubDate>
		<dc:creator><![CDATA[Jeff Palan]]></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Dictionaries]]></category>
		<category><![CDATA[Morphology]]></category>
		<category><![CDATA[Text Analytics]]></category>

		<guid isPermaLink="false">https://www.megaputer.com/?p=20267</guid>
		<description><![CDATA[<p>In previous builds of PolyAnalyst the only way to stop a word from showing up in the spell checker was to add it to the morphology dictionary. This would sometimes result in having to add things to the morphology dictionary that might not really belong there, such as product codes (Model ABC-XYZ) and the occasional...</p>
<p>The post <a rel="nofollow" href="https://www.megaputer.com/stop-list-dictionary-spell-check/">PolyAnalyst introduces support for using stop list dictionaries when analyzing spelling errors</a> appeared first on <a rel="nofollow" href="https://www.megaputer.com">Megaputer Intelligence</a>.</p>
]]></description>
				<content:encoded><![CDATA[<p>In previous builds of <a href="https://www.megaputer.com/polyanalyst/">PolyAnalyst</a>, the only way to stop a word from showing up in the spell checker was to add it to the morphology dictionary. This sometimes meant adding things to the morphology dictionary that did not really belong there, such as product codes (Model ABC-XYZ) and the occasional single foreign word (e.g., <em>yukata</em>).</p>
<p>To prevent this dictionary mismatch, PolyAnalyst now includes stop list functionality for spell checking. In other words, you can define a list of words for the spell checker to ignore without registering them as actual English words.</p>
<p>Consider the following example of dealing with the problematic word &#8220;yukata&#8221;.</p>
<h2>Without a stop list:</h2>
<p>The word is identified as a spelling error.</p>
<h4><strong><img class="size-full wp-image-20268 aligncenter" src="https://www.megaputer.com/wp-content/uploads/stoplist_graphic.png" alt="" width="222" height="105" /></strong></h4>
<h2>With a stop list: no more &#8220;yukata&#8221;!</h2>
<p>After adding the word to a stop list, the word is no longer identified as a spelling error.</p>
<p><img class="aligncenter size-full wp-image-20269" src="https://www.megaputer.com/wp-content/uploads/yukata-graphic.png" alt="" width="274" height="107" /></p>
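<p>The idea can be sketched in Python as follows. This is a minimal illustration, not the PolyAnalyst implementation: the word sets and the <em>spelling_errors</em> helper are hypothetical, and the sample sentence includes a genuine typo ("teh") to show that real errors are still reported.</p>
<pre>
```python
# Hypothetical stand-in for the morphology dictionary of known English words.
DICTIONARY = {"she", "wore", "a", "to", "the", "festival"}

# Words the spell checker should ignore without treating them as English words.
STOP_LIST = {"yukata"}


def spelling_errors(text, dictionary, stop_list=frozenset()):
    """Return tokens that are neither known words nor on the stop list."""
    tokens = text.lower().split()
    return [t for t in tokens if t not in dictionary and t not in stop_list]


text = "She wore a yukata to teh festival"
print(spelling_errors(text, DICTIONARY))             # ['yukata', 'teh']
print(spelling_errors(text, DICTIONARY, STOP_LIST))  # ['teh']
```
</pre>
<p>Keeping the two sets separate means the dictionary still contains only actual words, while the stop list absorbs product codes and foreign terms.</p>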
<p>The post <a rel="nofollow" href="https://www.megaputer.com/stop-list-dictionary-spell-check/">PolyAnalyst introduces support for using stop list dictionaries when analyzing spelling errors</a> appeared first on <a rel="nofollow" href="https://www.megaputer.com">Megaputer Intelligence</a>.</p>
]]></content:encoded>
			</item>
		<item>
		<title>Introducing the partofspeech function for PolyAnalyst</title>
		<link>https://www.megaputer.com/introducing-partofspeech-function-polyanalyst/</link>
		<pubDate>Thu, 15 Jun 2017 14:08:39 +0000</pubDate>
		<dc:creator><![CDATA[Jeff Palan]]></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Morphology]]></category>
		<category><![CDATA[Pattern Definition Language]]></category>
		<category><![CDATA[Text Analytics]]></category>

		<guid isPermaLink="false">https://www.megaputer.com/?p=22297</guid>
		<description><![CDATA[<p>This function allows you to create search queries that reference very specific grammatical attributes.</p>
<p>The post <a rel="nofollow" href="https://www.megaputer.com/introducing-partofspeech-function-polyanalyst/">Introducing the partofspeech function for PolyAnalyst</a> appeared first on <a rel="nofollow" href="https://www.megaputer.com">Megaputer Intelligence</a>.</p>
]]></description>
				<content:encoded><![CDATA[<p>Build 2010 &#8211; Released June 2017</p>
<p>As part of <a href="https://www.megaputer.com/polyanalyst/">PolyAnalyst’s</a> crusade for accuracy and ease of use, a new Pattern Definition Language (PDL) function, <strong>partofspeech</strong>, has been added to the function library. This function allows you to create search queries that reference very specific grammatical attributes.</p>
<p>For example,</p>
<pre>partofspeech(noun_singular)</pre>
<p>will match “desk” and “table” but not “desks” or “tables.”</p>
<pre>partofspeech(noun, building)</pre>
<p>will match “building” and “buildings”, but only when they are used as nouns (as opposed to a verb, as in “I’m building a desk”).</p>
<p>Similar results were possible previously, but they required more complex and esoteric queries.</p>
<p>This function can be especially useful in entity extraction. As an example, let&#8217;s run the following query on some public data from the National Highway Traffic Safety Administration.</p>
<pre>near(1, partofspeech(adjective), tire or tread)</pre>
<p>The result is all the instances where an adjective appeared next to the words “tire” or “tread”.</p>
<p><img class="size-medium wp-image-22298 aligncenter" src="https://www.megaputer.com/wp-content/uploads/part-of-speech-blog-img-1-tire-tread-300x224.png" alt="" width="300" height="224" srcset="https://www.megaputer.com/wp-content/uploads/part-of-speech-blog-img-1-tire-tread-300x224.png 300w, https://www.megaputer.com/wp-content/uploads/part-of-speech-blog-img-1-tire-tread-350x263.png 350w, https://www.megaputer.com/wp-content/uploads/part-of-speech-blog-img-1-tire-tread.png 353w" sizes="(max-width: 300px) 100vw, 300px" /></p>
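<p>For readers who want to see the logic spelled out, here is a rough Python analogue of that query. It is a sketch only: the tokens are tagged by hand rather than by PolyAnalyst, the helper name <em>adjectives_next_to</em> is hypothetical, and it assumes that a distance of 1 means the adjective sits directly before or after the target word.</p>
<pre>
```python
# Hand-tagged sample tokens (hypothetical data, not NHTSA records).
TAGGED = [
    ("the", "determiner"),
    ("rear", "adjective"),
    ("tire", "noun"),
    ("lost", "verb"),
    ("tread", "noun"),
    ("quickly", "adverb"),
]

TARGETS = {"tire", "tread"}


def adjectives_next_to(tagged, targets):
    """Return (adjective, target) pairs where the two tokens are adjacent."""
    hits = []
    for i, (word, pos) in enumerate(tagged):
        if pos != "adjective":
            continue
        # Look one token to the left and one token to the right.
        for j in (i - 1, i + 1):
            if j in range(len(tagged)) and tagged[j][0] in targets:
                hits.append((word, tagged[j][0]))
    return hits


print(adjectives_next_to(TAGGED, TARGETS))  # [('rear', 'tire')]
```
</pre>
<p>Here "tread" produces no hit because neither of its neighbors is tagged as an adjective.</p>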
<p>The post <a rel="nofollow" href="https://www.megaputer.com/introducing-partofspeech-function-polyanalyst/">Introducing the partofspeech function for PolyAnalyst</a> appeared first on <a rel="nofollow" href="https://www.megaputer.com">Megaputer Intelligence</a>.</p>
]]></content:encoded>
			</item>
	</channel>
</rss>
