An Introduction to Wildcard Searches with Inspyder Power Search
This guide gives a brief introduction to using wildcard matching within Inspyder
Power Search.
What are Wildcard Searches?
Wildcard searches are a simple way to match patterns of text. The pattern
can be very straightforward or very complex. We've included wildcard support in
Inspyder Power Search to make it easier for you to find complicated items, or groups of
items, on your site.
Basic Wildcards
The basic wildcard character ('*') matches zero or more characters in the
search pattern. For example, 'a*.html' matches:
- "a.html"
- "ab.html"
- "abc.html"
- etc.
To match a single character, we use the question-mark character ('?').
For example, 'a?.html' will match the following:
- aa.html
- ab.html
- ac.html
- etc.
It will NOT match words like:
- a.html
- abc.html
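The behaviour of '*' and '?' can be tried out away from the application. The sketch below is not Power Search's own matcher; it assumes that Python's standard fnmatch module, which uses the same two wildcard characters, is a reasonable stand-in. The file names are made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical file names to test against.
names = ["a.html", "ab.html", "abc.html", "b.html"]

# '*' stands for zero or more characters.
star_matches = [n for n in names if fnmatchcase(n, "a*.html")]

# '?' stands for exactly one character.
question_matches = [n for n in names if fnmatchcase(n, "a?.html")]

print(star_matches)      # ['a.html', 'ab.html', 'abc.html']
print(question_matches)  # ['ab.html']
```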
Ranges
To match a character within a range (for example, a numeric character, 0 to 9) we
can use ranges. A range is encapsulated within square brackets, [], and contains
the set of characters to search for. For example, if we wanted to match 'inventoryX.html'
where the 'X' is any numeric digit, we could write
inventory[0123456789].html.
This will match the following:
- inventory1.html
- inventory2.html
- inventory3.html
- etc.
It will NOT match words like:
- inventoryA.html
- inventory10.html
As a shorthand, the following is also valid:
inventory[0-9].html
Multiple ranges can be combined within a single set of brackets, such as the
letters A to F and the numbers 0 to 9. For example,
inventory[A-F0-9].html
will match:
- inventory1.html
- inventoryA.html
- inventoryB.html
- inventoryF.html
It will NOT match words like:
- inventoryG.html
- inventoryZ.html
It's important to note that a range only matches a single character. To find inventoryXX.html,
where each 'X' is a digit, we'd have to write
inventory[0-9][0-9].html.
If the search is case sensitive, then [A-Z] and [a-z] are different ranges.
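The range rules above can be checked with a short experiment. As before, this is only an approximation: it assumes Python's fnmatch module (whose bracket syntax looks the same) behaves like Power Search, and it uses fnmatchcase to model a case-sensitive search. The file names are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical file names to test against.
names = [
    "inventory1.html",
    "inventoryA.html",
    "inventorya.html",
    "inventoryG.html",
    "inventory42.html",
]

def matching(pattern):
    """Return the names that match the pattern, case-sensitively."""
    return [n for n in names if fnmatchcase(n, pattern)]

# A single range matches exactly one character.
one_digit = matching("inventory[0-9].html")

# Two ranges in one set of brackets: A to F or 0 to 9.
hex_like = matching("inventory[A-F0-9].html")

# Two bracket sets are needed to match two characters.
two_digits = matching("inventory[0-9][0-9].html")

print(one_digit)   # ['inventory1.html']
print(hex_like)    # ['inventory1.html', 'inventoryA.html']
print(two_digits)  # ['inventory42.html']
```

Note that 'inventorya.html' is not matched by [A-F0-9] here because the comparison is case sensitive.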
Alternates
We can specify matches that take the form "one or the other" by using alternates.
Alternates are enclosed in curly brackets, {}, and each alternate is separated by
a comma. For example, to find all the words that start with 'Ins' and
end with 'pyder' or 'ite', we would use the following syntax:
Ins{pyder,ite}.
This will match the following:
- Inspyder
- Insite
But it will NOT match:
- Inside
- Insight
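One way to picture alternates is as shorthand for several plain patterns. The sketch below takes that view; expand_alternates and matches are hypothetical helper names, and the approach (expand the braces, then test each result with Python's fnmatch) is an assumption made for illustration, not a description of how Power Search works internally. Escaped braces are not handled.

```python
import re
from fnmatch import fnmatchcase

def expand_alternates(pattern):
    """Expand every {a,b,...} group into a list of plain patterns."""
    # Find an innermost brace group (one with no braces inside it).
    group = re.search(r"\{([^{}]*)\}", pattern)
    if group is None:
        return [pattern]
    head, tail = pattern[:group.start()], pattern[group.end():]
    expanded = []
    for option in group.group(1).split(","):
        # Substitute one option, then expand any remaining groups.
        expanded.extend(expand_alternates(head + option + tail))
    return expanded

def matches(text, pattern):
    """True if the text matches any expansion of the pattern."""
    return any(fnmatchcase(text, p) for p in expand_alternates(pattern))

print(expand_alternates("Ins{pyder,ite}"))   # ['Inspyder', 'Insite']
print(matches("Inspyder", "Ins{pyder,ite}"))  # True
print(matches("Inside", "Ins{pyder,ite}"))    # False
```

An empty option, as in {ite,}, expands to the pattern with nothing in that position, which gives the "this or nothing" reading.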
Literals
To match a special character 'as is' (such as '*' or '?'), precede it
with a backslash ('\') in the query text. For example,
'question\?'
will match the following:
- question?
But it will NOT match:
- questions
- question!
The same holds true for the '{', '}', '[' and ']' characters.
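To see why the backslash matters, it helps to compare an escaped pattern with an unescaped one. The sketch below translates a wildcard pattern into a Python regular expression; wildcard_to_regex is a hypothetical helper that handles only '*', '?' and the backslash, and the translation is an assumption about the semantics, not Power Search's implementation.

```python
import re

def wildcard_to_regex(pattern):
    """Translate '*', '?' and backslash escapes into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # A backslash makes the next character literal.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "*":
            parts.append(".*")   # zero or more characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts))

literal = wildcard_to_regex(r"question\?")
wildcard = wildcard_to_regex("question?")

print(bool(literal.fullmatch("question?")))   # True
print(bool(literal.fullmatch("questions")))   # False
print(bool(wildcard.fullmatch("questions")))  # True
```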
Extracting Data
In addition to Wildcard matches, Power Search provides a powerful mechanism for
extracting data from websites. By including "#DataName#" in your search pattern
(where "DataName" is the name of the column you wish to extract data into), any
text found in place of "#DataName#" will be captured and saved.
For example, the title text of a webpage is stored between HTML title tags as follows:
<title>Inspyder Software Inc.</title>
We could create a Wildcard Match like:
<title>#WebpageTitle#</title>
(Remember to check "Include HTML".) When we run this search on the website, each
page's title will be extracted and stored in a column called "WebpageTitle".
It's possible to include multiple fields in the Wildcard Search. For example:
<IMG*src="#Source#"*alt="#AltText#"*>
This search will extract the "src" and "alt" attributes from all images (where the
alt attribute is present). We could enhance this pattern to make the "alt" attribute
optional:
<IMG*src="#Source#"*{alt="#AltText#",}*>
By adding the Alternate syntax, {alt="#AltText#",}, the match includes either
alt="#AltText#" or nothing (since nothing follows the comma).
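The extraction idea can be approximated outside the application by turning each "#DataName#" into a named capture group. The sketch below is an illustration only: pattern_to_regex is a hypothetical helper, it handles just '*', '?' and "#DataName#" (not ranges or alternates), and it assumes a case-insensitive search in which '*' and captured fields take as little text as possible. The sample page is made up.

```python
import re

def pattern_to_regex(pattern):
    """Translate '*', '?' and '#Name#' into a compiled regex (sketch only)."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "#":
            end = pattern.find("#", i + 1)
            name = pattern[i + 1:end] if end != -1 else ""
            if name.isidentifier():
                # '#Name#' becomes a named capture group.
                parts.append("(?P<%s>.*?)" % name)
                i = end + 1
                continue
        if ch == "*":
            parts.append(".*?")  # zero or more characters, shortest first
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    # Assumption: the search is case-insensitive.
    return re.compile("".join(parts), re.IGNORECASE)

# Hypothetical page content.
page = """<title>Inspyder Software Inc.</title>
<img src="logo.png" alt="Company logo">
<IMG class="hero" src="banner.jpg" alt="Banner">"""

title_match = pattern_to_regex("<title>#WebpageTitle#</title>").search(page)
title = title_match.group("WebpageTitle")

image_pattern = pattern_to_regex('<IMG*src="#Source#"*alt="#AltText#"*>')
images = [m.groupdict() for m in image_pattern.finditer(page)]

print(title)   # Inspyder Software Inc.
print(images)
```

Each dictionary in images plays the role of one row, with "Source" and "AltText" as the column names.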
