Words in English: Parsing Practice – Rice University
To parse a word means to analyze it into component morphemes. Recall
that morphemes are the smallest units in a language that link a form
with a meaning or function.
Parsing is generally done on complex words that came from Latin and
Greek. (We call such words “Latinate” vocabulary or “Classical”
vocabulary.) Such words typically show the clearest word structure,
in part because Latin and Greek had many affixes for inflection and
derivation, and (unlike in Germanic) their word structure REQUIRED putting together roots with affixes.
Further, many Classical words were
coined long after the classical period, so the word structure is more
transparent than words from English or French that have been in the
language so long that their morphological structure has become
murky. With many native and nativized words, what were once
separate morphemes have over a long period of time fused together.
For example, the native word stirrup comes from stig ‘climb’ +
rap ‘rope’. The word meant in Old English ‘loop of rope for
placing the foot to climb on a horse’. This word was a compound in Old English, with two separate
morphemes, but now it is a single, unanalyzable morpheme with the
modern meaning ‘device for holding the foot when mounting and riding a
horse’. The whole word now has one morpheme instead of two, and it no
longer refers specifically to rope at all.
The following example words are for parsing practice. For each
morpheme in a word:
2. below it write the morpheme’s
meaning or function. (There may be some parts of the word that are
“linking forms” without any meaning.)
To complete the parse, we state
the actual meaning of the whole word in Modern English. Note that this
meaning may be somewhat indirectly related to the component morphemes.
The ‘e’ in parentheses is only there for spelling reasons–it has no
etymological connection with the word for ‘create’ in Latin. It is
only a prompt to remind us that the morpheme /ate/ is pronounced with
a front mid vowel.
Sample words for parsing
One set of sample words comprises the phonetics terminology for our
class. For these words see Sound terminology.
apteryx hippopotamus megalith
perihelion bilabial eliminate
transliterate seminal iatrogenic
anhydrous biennial apnea
endoscopy supercilious aphelion
inculpate exophthalmic laryngoscope
anemia osculate subcutaneous
luminary amygdala polysemy
pandemic androgynous agenda
memorandum exculpate hippocampus
More sample words for parsing
confluence megalith incarnation
cryptogenic geminate phyllophagous
nyctitropism phototropic phytogenic
aphasia perigee oenophile
formicivorous apterous aliform
arachnophobia apiculture oology
galactic errant errand
Parsing vs. Etymology
Parsing is related to finding the etymology of a word, but it is a
little different because the focus is on word structure, rather than
word history. This has various consequences.
Word structure (for our purposes) includes primarily roots
and affixes. So, many of the original bits of the source word, such as
inflectional morphology in the original language, are not relevant to a
parse. For example, for hippopotamus, you might find in a dictionary
etymology that the word comes from Greek hippos ‘horse’
followed by Greek potamos ‘river’. The dictionary etymology
might also indicate that the -us ending comes from Latin (Latin
and Greek were fairly closely related languages, and the Greek noun
inflectional ending -os is historically/etymologically the same
as Latin -us.)
In a parse, we leave out the information about what language the
word parts come from: it is not relevant for this purpose.
Even more important, we also strip the source elements down to their
roots, removing inflectional endings from the original language
that the dictionary etymology included, if they do not survive in the
modern word. The resulting parse:
hipp    + o      + potam   + us
‘horse’   linker   ‘river’   ‘noun inflection’
‘large thick-skinned herbivorous mammal living in and around tropical waters’
For the definition, you have to get close enough to the modern meaning
for someone to understand the thing defined as something distinct from
similar things, but you do not need a very technically precise
definition. For our purposes ‘large African mammal living around
rivers and swamps’ would be good enough.
The definition should preserve the part of speech of the word defined. So you would
not define somnambulant as ‘to sleep-walk’, but rather
‘sleep-walking’. It is an adjective, not a verb, so the definition
must be appropriate for an adjective.
As stated above, parsing is generally done on complex words that came
from the classical languages.
The aim in parsing is to find out the structure of the word,
isolating the meaningful elements that recur not only in this word but
in other words, so that
we can learn more of those elements and learn more words that use them.
Etymology, on the other hand, is more like the story of a word from
the earliest point we can trace, to its modern meaning. Etymology can
be done on any word, because all words have SOME history. Even a novel
creation like googol ‘mathematical term for 10 to the 100th
power’ has an etymology: “Novel creation of amusing-sounding
word by young son of the mathematician who defined it”. But
it wouldn’t make too much sense to try to parse googol,
because it is a simplex word, i.e. it has only one morpheme in it.
In the hippopotamus example, the parse is different from the
etymology, not only because a parse does not include the source
language of loanwords as an etymology does, but also because some
dictionary etymologies break the word down into whole source
words instead of roots, e.g. an etymology might state: “from
L. hippopotamus, from Gr. hippos ‘horse’ +
potamos ‘river’”. (Dictionary etymologies are heavily
abbreviated and you have to figure out the abbreviations for
the dictionary you use.) The
-os part of both of the components of the compound was just a
Greek inflectional ending signalling a certain class of masculine noun
with nominative case. It’s not in the parse because it doesn’t show up
in the word today. The -us ending of hippopotamus, on
the other hand, DOES show up in the modern word so we must take
account of it. In fact it is the Latin version of the same Greek
inflectional ending seen in hipp-os. It is enough to just gloss
it as ‘noun inflection’. Later (Ch. 9) we will learn some of the
inflectional categories of Latin and Greek which have ended up in our
To find the elements relevant to parsing, look in our textbook in
Appendix 1, starting on page 221. These elements are the pure roots
and affixes, without additional morphology, such as inflectional
morphemes that allowed them to be used in whole words in Latin and Greek.
That is what we want to use in parsing: roots and affixes.
© Suzanne Kemmer
What is another word for parse? – Synonyms – WordHippo
To examine closely or to scrutinize
To place something or someone in a particular context
To inspect carefully or in detail
To break down into basic elements (for analysis)
take to mean
figure it to be
make sense of
come to the conclusion
draw the inference
have a hunch
be of the opinion
make head or tail of
cotton on to
catch on to
throw light on
take for granted
get to the bottom of
hazard a guess
come to understand
read between the lines
get the picture
get the message
put two and two together
read between lines
take as read
be of the view
latch on to
get a fix on
get the hang of
get one’s head around
become cognizant of
get the point of
get the drift of
get the idea of
venture a guess
take a stab
take a shot
have a sneaking suspicion
get the point
get the idea
find the key to
find the answer to
find the key
take something on board
see the light
be given to understand
be led to believe
perceive the meaning of
draw a conclusion
“The classicists must have been boring their mates with this fact every four years for as long as they could parse a sentence. ”
take stock of
look at carefully
subject to an examination
make sure of
give something a once-over
give something a going-over
give something a look-see
give the once-over
go over with a fine-tooth comb
take a dekko at
check up on
cast an eye over
put to the test
try on for size
put under a microscope
carry out trials on
put something through its paces
give a tryout
make a trial run
put through the wringer
give the third degree
run your eye over
look something over
give the once over
take a look at
get a load of
look up and down
seek an answer
put the screws on
put through the mangle
put questions to
pick one’s brains
put the screws to
put through the third degree
put to the proof
find out about
ask questions about
make enquiries about
pass under review
make enquiries as to
worm something out of someone
run hands over
set an examination for
make enquiries into
conduct investigations into
make inquiries into
have a taste of
give something a whirl
give it a go
conduct an enquiry
run something up the flagpole
see how it flies
see how wind blows
send up a balloon
run idea by someone
run it up a flagpole
put someone through their paces
leave no stone unturned in
go through with a fine-tooth comb
put to trial
play around with
carry out tests
do tests on
carry out trials
see to it
have a look-see
take a gander
have a tour of
have a look round
have a look around
go on a tour of
get a bird’s-eye view of
be in no doubt
take the measure of
put a question to
request information of
seek information of
want to know
put out feelers
pop the question
test the waters
subject to an inspection
kick the tires
ferret around in
root around in
ferret about in
turn inside out
root about in
give something a check-up
ask questions of
give the third degree to
ask pointed questions
put on the hotseat
What is data parsing? – ScrapingBee
07 June, 2021
Kevin worked in the web scraping industry for 10 years before co-founding ScrapingBee. He is also the author of the Java Web Scraping Handbook.
Data parsing is the process of taking data in one format and transforming it to another format. You’ll find parsers used everywhere. They are commonly used in compilers when we need to parse computer code and generate machine code.
This happens all the time when developers write code that gets run on hardware. Parsers are also present in SQL engines. SQL engines parse a SQL query, execute it, and return the results.
In the case of web scraping, this usually happens after data has been extracted from a web page via web scraping. Once you’ve scraped data from the web, the next step is making it more readable and better for analysis so that your team can use the results effectively.
Parsers are heavily used in web scraping because the raw HTML we receive isn’t easy to make sense of. We need the data changed into a format that’s interpretable by a person. That might mean generating reports from HTML strings or creating tables to show the most relevant information.
Even though there are multiple uses for parsers, this blog post focuses on data parsing for web scraping because it’s an online activity that thousands of people handle every day.
How to build a data parser
Regardless of what type of data parser you choose, a good parser will figure out what information from an HTML string is useful, based on pre-defined rules. There are usually two steps to the parsing process: lexical analysis and syntactic analysis.
Lexical analysis is the first step in data parsing. It creates tokens from a sequence of characters that come into the parser as a string of unstructured data, like HTML. The parser makes the tokens by using lexical units like keywords and delimiters. It also ignores irrelevant information like whitespace and comments.
After the parser has separated the data between lexical units and the irrelevant information, it discards all of the irrelevant information and passes the relevant information to the next step.
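The lexical analysis step can be sketched in a few lines of Python. This is a minimal illustration, not a production tokenizer: the `Token` type, the `TOKEN_RE` pattern, and the `tokenize` function are names made up for this example, and the pattern assumes simple, well-formed markup (it does not handle self-closing tags or `<` characters inside attribute values).

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    kind: str   # "OPEN", "CLOSE", or "TEXT"
    value: str  # tag name for OPEN/CLOSE, stripped text for TEXT


# Hypothetical pattern for this sketch: comments are matched so they can be
# discarded, tags become OPEN/CLOSE tokens, everything else is TEXT.
TOKEN_RE = re.compile(
    r"(?P<COMMENT><!--.*?-->)"
    r"|(?P<CLOSE></[A-Za-z][^>]*>)"
    r"|(?P<OPEN><[A-Za-z][^>]*>)"
    r"|(?P<TEXT>[^<]+)",
    re.DOTALL,
)


def tokenize(html: str) -> List[Token]:
    """Turn a raw HTML string into a flat list of relevant tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(html):
        kind, value = match.lastgroup, match.group()
        if kind == "COMMENT":
            continue  # irrelevant: comments are dropped
        if kind == "TEXT":
            value = value.strip()
            if not value:
                continue  # irrelevant: whitespace-only text is dropped
        elif kind == "OPEN":
            value = value[1:-1].split()[0].lower()  # keep only the tag name
        else:  # CLOSE
            value = value[2:-1].strip().lower()
        tokens.append(Token(kind, value))
    return tokens


sample = '<div> <!-- note --> <p class="x">Hello</p>\n</div>'
for token in tokenize(sample):
    print(token)
```

The comment and the whitespace between tags never reach the output; only the tag and text tokens are passed on to the next step.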
The next part of the data parsing process is syntactic analysis. This is where parse tree building happens. The parser takes the relevant tokens from the lexical analysis step and arranges them into a tree. Tokens that only mark structure, like semicolons and curly braces, are not kept as content; they determine the nesting structure of the tree.
Once the parse tree is finished, then you’re left with relevant information in a structured format that can be saved in any file type. There are several different ways to build a data parser, from creating one programmatically to using existing tools. It depends on your business needs, how much time you have, what your budget is, and a few other factors.
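As a rough sketch of the syntactic analysis step, the Python below arranges a flat token list into a nested tree using a stack. The token shape (`("OPEN", name)`, `("CLOSE", name)`, `("TEXT", text)`) and the `build_tree` function are assumptions made for this illustration; real HTML parsers also recover from mismatched tags, which this sketch simply rejects.

```python
def build_tree(tokens):
    """Arrange (kind, value) tokens into a nested dict-based parse tree."""
    root = {"tag": "root", "children": []}
    stack = [root]  # the innermost open element is always stack[-1]
    for kind, value in tokens:
        if kind == "OPEN":
            node = {"tag": value, "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)  # descend into the new element
        elif kind == "CLOSE":
            if len(stack) == 1 or stack[-1]["tag"] != value:
                raise ValueError(f"unexpected closing tag: {value}")
            stack.pop()  # climb back to the parent
        else:  # TEXT
            stack[-1]["children"].append(value)
    if len(stack) != 1:
        raise ValueError(f"unclosed tag: {stack[-1]['tag']}")
    return root


tokens = [
    ("OPEN", "div"),
    ("OPEN", "p"),
    ("TEXT", "Hello"),
    ("CLOSE", "p"),
    ("CLOSE", "div"),
]
tree = build_tree(tokens)
print(tree)
```

The opening and closing tokens do not appear as nodes of their own; they only decide where each piece of text sits in the nesting.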
To get started, let’s take a look at HTML parsing libraries.
HTML parsing libraries
HTML parsing libraries are great for adding automation to your web scraping flow. You can connect many of these libraries to your web scraper via API calls and parse data as you receive it.
Here are a few popular HTML parsing libraries:
Scrapy or BeautifulSoup
These are libraries written in Python. BeautifulSoup is a Python library for pulling data out of HTML and XML files. Scrapy is a data parser that can also be used for web scraping. When it comes to web scraping with Python, there are a lot of options available and it depends on how hands-on you want to be.
For those that work primarily with Java, there are options for you as well. JSoup is one option. It allows you to work with real-world HTML through its API for fetching URLs and extracting and manipulating data. It acts as both a web scraper and a web parser. It can be challenging to find other Java options that are open-source, but it’s definitely worth a look.
There’s an option for Ruby as well. Take a look at Nokogiri. It allows you to work with HTML and XML with Ruby. It has an API similar to the other packages in other languages that lets you query the data you’ve retrieved from web scraping. It adds an extra layer of security because it treats all documents as untrusted by default. Data parsing in Ruby can be tricky as it can be harder to find gems you can work with.
Now that you have an idea of what libraries are available for your web scraping and data parsing needs, let’s address a common issue with HTML parsing: regular expressions. Sometimes data isn’t well-formatted inside of an HTML tag and we need to use regular expressions to extract the data we need.
You can build regular expressions to get exactly what you need from difficult data. Tools like regex101 can be an easy way to test out whether you’re targeting the correct data or not. For example, you might want to get your data specifically from all of the paragraph tags on a web page. That regular expression might look something like this:
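The expression itself is not reproduced here. As a stand-in, one possible pattern (an assumption, not necessarily the one the post had in mind) can be tried with Python's `re` module. The names `PARAGRAPH_RE` and `extract_paragraphs` are made up for this sketch, and the pattern assumes paragraphs are not nested and are explicitly closed.

```python
import re

# Matches <p ...>...</p> and captures the inner content.
# \b keeps the pattern from also matching tags such as <pre>.
PARAGRAPH_RE = re.compile(r"<p\b[^>]*>(.*?)</p>", re.DOTALL | re.IGNORECASE)


def extract_paragraphs(html):
    """Return the inner content of every paragraph tag in the string."""
    return PARAGRAPH_RE.findall(html)


sample = '<p>First</p><pre>code</pre><div><p class="x">Second\nline</p></div>'
print(extract_paragraphs(sample))
```

Patterns like this are brittle against real-world markup, which is one reason a dedicated HTML parsing library is usually the safer choice.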
The syntax for regular expressions changes slightly depending on which programming language you’re working with. Most of the time, if you’re working with one of the libraries we listed above or something similar, you won’t have to worry about generating regular expressions.
If you aren’t interested in using one of those libraries, you might consider building your own parser. This can be challenging, but potentially worth the effort if you’re working with extremely complex data structures.
Building your own parser
When you need full control over how your data is parsed, building your own tool can be a powerful option. Here are a few things to consider before building your own parser.
A custom parser can be written in any programming language you like. You can make it compatible with other tools you’re using, like a web crawler or web scraper, without worrying about integration issues.
In some cases, it might be cost-effective to build your own tool. If you already have a team of developers in-house, it might not be too big of a task for them to accomplish.
You have granular control over everything. If you want to target specific tags or keywords, you can do that. Any time you have an update to your strategy, you won’t have many problems with updating your data parser.
On the other hand, there are a few challenges that come with building your own parser.
The HTML of pages is constantly changing. This could become a maintenance issue for your developers. Unless you foresee your parsing tool becoming of huge importance to your business, taking that time from product development might not be effective.
It can be costly to build and maintain your own data parser. If you don’t have a developer team, contracting the work is an option but that could lead to steep bills based on developers’ hourly rates. There’s also the cost of ramping up developers that are new to the project as they figure out how things work.
You will also need to buy, build, and maintain a server to host your custom parser on. It has to be fast enough to handle all of the data that you send through it or else you might run into issues with parsing data consistently. You’ll also have to make sure that server stays secure since you might be parsing sensitive data.
Having this level of control can be nice if data parsing is a big part of your business; otherwise, it could add more complexity than is necessary. There are plenty of reasons for wanting a custom parser. Just make sure that it’s worth the investment over using an existing tool.
Parsing meta data
There’s also another way to parse web data: through a website’s schema. Web schema standards are managed by Schema.org, a community that promotes schema for structured data on the web. Web schema is used to help search engines understand information on web pages and provide better results.
There are many practical reasons people want to parse schema metadata. For example, companies might want to parse schema for an e-commerce product to find updated prices or descriptions. Journalists could parse certain web pages to get information for their news articles. There are also websites that might aggregate data like recipes, how-to guides, and technical articles.
Schema comes in different formats. You’ll hear about JSON-LD, RDFa, and Microdata schema. These are the formats you’ll likely be parsing.
RDFa (Resource Description Framework in Attributes) is recommended by the World Wide Web Consortium (W3C). It’s used to embed RDF statements in XML and HTML. One big difference between this and the other schema types is that RDFa only defines the metasyntax for semantic tagging.
Microdata is a WHATWG HTML specification that’s used to nest metadata inside existing content on web pages. Microdata standards allow developers to design a custom vocabulary or use others like Schema.org.
All of these schema types are easily parsable with a number of tools across different languages. There’s a library from ScrapingHub, another from RDFLib.
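To give a feel for what parsing one of these formats involves, here is a minimal sketch for JSON-LD using only Python's standard library (`html.parser` and `json`), rather than any of the dedicated libraries mentioned above. The `JsonLdExtractor` class, the `extract_jsonld` helper, and the sample page are all invented for this illustration; the sketch assumes the metadata sits in `<script type="application/ld+json">` blocks and skips any block that is not valid JSON.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the decoded contents of JSON-LD script blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip blocks that are not valid JSON


def extract_jsonld(html):
    """Return a list of parsed JSON-LD objects found in the HTML string."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    return extractor.items


# Made-up sample page for illustration only.
page = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product",
 "name": "Example Widget", "offers": {"@type": "Offer", "price": "19.99"}}
</script>
</head><body><p>Example Widget</p></body></html>
"""

for item in extract_jsonld(page):
    print(item["@type"], item["name"], item["offers"]["price"])
```

RDFa and Microdata are embedded in element attributes rather than in a script block, so they need a different extraction strategy than this one.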
We’ve covered a number of existing tools, but there are other great services available, such as the ScrapingBee Google Search API. This tool allows you to scrape search results in real-time without worrying about server uptime or code maintenance. You only need an API key and a search query to start scraping and parsing web data.
There are many other web scraping tools, like JSoup, Puppeteer, Cheerio, or BeautifulSoup.
A few benefits of purchasing a web parser include:
Using an existing tool is low maintenance.
You don’t have to invest a lot of time with development and configurations.
You’ll have access to support that’s trained specifically to use and troubleshoot that particular tool.
Handling server issues will not be something you need to worry about.
Some of the downsides of purchasing a web parser include:
You won’t have granular control over the way your parser handles data, although you will have some options to choose from.
It could be an expensive upfront cost.
Parsing data is a common task in everything from market research to gathering data for machine learning processes. Once you’ve collected your data using a mixture of web crawling and web scraping, it will likely be in an unstructured format. This makes it hard to get insightful meaning from it.
Using a parser will help you transform this data into any format you want whether it’s JSON or CSV or any data store. You could build your own parser to morph the data into a highly specified format or you could use an existing tool to get your data quickly. Choose the option that will benefit your business the most.
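As a small illustration of that last transformation step, the sketch below writes the same parsed records out as JSON and as CSV using Python's standard `json` and `csv` modules. The `records` data and the `to_json` and `to_csv` helpers are invented for this example; it assumes every record has the same keys.

```python
import csv
import io
import json

# Made-up records standing in for data a parser has already extracted.
records = [
    {"title": "Example Widget", "price": "19.99"},
    {"title": "Another Widget", "price": "4.50"},
]


def to_json(rows):
    """Serialize the records as a pretty-printed JSON string."""
    return json.dumps(rows, indent=2)


def to_csv(rows):
    """Serialize the records as CSV text, using the first row's keys as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(to_json(records))
print(to_csv(records))
```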
Frequently Asked Questions about what is the meaning of parse
What is parse example?
Parse is defined as to break something down into its parts, particularly for study of the individual parts. An example of to parse is to break down a sentence to explain each element to someone. … Parsing breaks down words into functional units that can be converted into machine language.
What does it mean to parse your words?
To parse a word means to analyze it into component morphemes. Recall that morphemes are the smallest units in a language that link a form with a meaning or function. Parsing is generally done on complex words that came from Latin and Greek.
What is another word for parse?
examine, inspect, sift, delve into, dig into, query, catechize, try, see, check up on