This guide gives a brief introduction to using wildcard matching within Power Search.
Wildcard searches are a simple way to match patterns of text. A pattern can be very straightforward or very complex. We've included wildcard support in Inspyder Finder to make it easier for you to find complicated items, or groups of items, on your site.
The basic wildcard character ('*') stands for zero or more characters in the text being searched. For example, 'a*.html' matches any text that starts with 'a' and ends with '.html', including 'a.html' itself.
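To match a single character, we use the question-mark character '?'. For example, 'a?.html' will match text that has exactly one character between 'a' and '.html'.
It will NOT match text that has no character, or two or more characters, in that position.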
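As a rough illustration of the two wildcards above, the sketch below uses Python's standard fnmatch module, whose '*' and '?' behave the same way. It is only an approximation: Power Search's own matcher is not shown in this guide, and the file names are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, chosen only for illustration.
names = ["a.html", "ab.html", "about.html", "b.html"]

# '*' stands for zero or more characters.
star = [n for n in names if fnmatchcase(n, "a*.html")]

# '?' stands for exactly one character.
single = [n for n in names if fnmatchcase(n, "a?.html")]

print(star)    # names accepted by 'a*.html'
print(single)  # names accepted by 'a?.html'
```

Note that fnmatchcase compares the whole name against the pattern; whether Power Search matches whole words or substrings is not stated here.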
To match a character within a range (for example, a numeric character, 0 to 9) we can use ranges. A range is enclosed in square brackets, [], and contains the set of characters to search for. For example, if we wanted to match 'inventoryX.html' where the 'X' is any numeric digit, we could write inventory[0123456789].html. This will match inventory0.html through inventory9.html.
As a shorthand, the following is also valid: inventory[0-9].html
Multiple ranges within a single set of brackets are acceptable (for example, the letters A to F and the numbers 0 to 9). For example, inventory[A-F0-9].html will match names such as inventoryA.html and inventory7.html.
It's important to note that a range only matches a single character. To find inventoryXX.html, we'd have to do the following: inventory[0-9][0-9].html. If the search is case sensitive, then [A-Z] and [a-z] are different ranges.
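The range rules above can be tried out with Python's fnmatch module, which uses the same square-bracket notation. This is a sketch on hypothetical file names, not Power Search itself; fnmatchcase is used so that the comparison is case sensitive.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, chosen only for illustration.
names = [
    "inventory1.html",
    "inventory12.html",
    "inventoryB.html",
    "inventoryb.html",
    "inventoryZ.html",
]

def matching(pattern):
    """Return the names accepted by a case-sensitive wildcard pattern."""
    return [n for n in names if fnmatchcase(n, pattern)]

one_digit = matching("inventory[0-9].html")        # a single digit
hex_like = matching("inventory[A-F0-9].html")      # two ranges in one set
two_digits = matching("inventory[0-9][0-9].html")  # one range per character

print(one_digit)
print(hex_like)
print(two_digits)
```

Because the comparison is case sensitive, 'inventoryb.html' is not accepted by [A-F0-9], which mirrors the point that [A-Z] and [a-z] are different ranges.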
We can specify matches that take the form "one or the other" by using alternates. Alternates are enclosed in curly brackets, {}, and each alternate is separated by a comma. For example, if we wanted to find all the words that start with 'Ins' and end with 'pyder' or 'ite', we would use the following syntax: Ins{pyder,ite}. This will match 'Inspyder' and 'Insite'.
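Python's fnmatch module has no curly-bracket alternates, so the sketch below translates them into a regular expression by hand. The function name alternates_to_regex is hypothetical, nesting is not handled, and everything outside the brackets is treated as literal text; it illustrates the idea rather than how Power Search implements it.

```python
import re

def alternates_to_regex(pattern: str) -> str:
    """Translate {a,b} alternates into a regex; all other text is literal."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "{":
            end = pattern.index("}", i)  # no nesting in this sketch
            options = pattern[i + 1:end].split(",")
            out.append("(?:" + "|".join(re.escape(o) for o in options) + ")")
            i = end + 1
        else:
            out.append(re.escape(ch))
            i += 1
    return "".join(out)

regex = re.compile(alternates_to_regex("Ins{pyder,ite}"))

# Hypothetical words, chosen only for illustration.
words = ["Inspyder", "Insite", "Inside"]
matched = [w for w in words if regex.fullmatch(w)]
print(matched)
```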
To match a special character 'as is' (such as '*' or '?'), precede it with a backslash ('\') in the query text. For example, 'question\?' will match the literal text 'question?'.
But it will NOT match text such as 'questions', where another character stands in place of the question mark.
The same holds true for the '{', '}', '[' and ']' characters.
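To make the escaping rule concrete, here is a small Python sketch that turns a pattern into a regular expression and treats any character after a backslash as literal. The helper name escaped_wildcard_to_regex is hypothetical, and only '*', '?' and the backslash are handled; it is an assumption about the behaviour described above, not Power Search's actual code.

```python
import re

# Wildcards handled by this sketch and their regex equivalents.
SPECIAL = {"*": ".*", "?": "."}

def escaped_wildcard_to_regex(pattern: str) -> str:
    """Translate '*' and '?' to regex; a backslash makes the next char literal."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            out.append(re.escape(pattern[i + 1]))  # escaped: match as is
            i += 2
        elif ch in SPECIAL:
            out.append(SPECIAL[ch])
            i += 1
        else:
            out.append(re.escape(ch))
            i += 1
    return "".join(out)

literal = re.compile(escaped_wildcard_to_regex(r"question\?"))
wildcard = re.compile(escaped_wildcard_to_regex("question?"))

print(bool(literal.fullmatch("question?")))   # escaped '?' matches itself
print(bool(literal.fullmatch("questions")))   # but not another character
print(bool(wildcard.fullmatch("questions")))  # unescaped '?' matches any one
```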
In addition to Wildcard matches, Power Search provides a powerful mechanism for extracting data from websites. By including "#DataName#" in your search pattern (where "DataName" is the name of the column you wish to extract data into), any text found in place of "#DataName#" will be captured and saved.
For example, the title text of a webpage is stored between HTML title tags as follows:
<title>Inspyder Software Inc.</title>
We could create a Wildcard Match like:
<title>#WebpageTitle#</title>
(Remember to check "Include HTML".) When we run this search on the website, each page's title will be extracted and stored in a column called "WebpageTitle".
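The extraction idea can be sketched in Python by turning each "#DataName#" placeholder into a named capture group. The function capture_pattern_to_regex is hypothetical, handles only '*' and "#Name#", and assumes a placeholder captures as little text as possible; Power Search's real capture rules may differ.

```python
import re

def capture_pattern_to_regex(pattern: str) -> re.Pattern:
    """Turn #Name# into a named group; '*' is a wildcard; the rest is literal."""
    # re.split with a capture group alternates literal text and field names.
    parts = re.split(r"#(\w+)#", pattern)
    out = []
    for idx, part in enumerate(parts):
        if idx % 2 == 1:
            out.append(f"(?P<{part}>.*?)")  # field name becomes a group
        else:
            out.append(".*?".join(re.escape(p) for p in part.split("*")))
    return re.compile("".join(out), re.DOTALL)

title_re = capture_pattern_to_regex("<title>#WebpageTitle#</title>")

# Hypothetical page source built around the title shown above.
page = "<html><head><title>Inspyder Software Inc.</title></head></html>"
match = title_re.search(page)
row = match.groupdict() if match else {}
print(row)
```

Each key of the resulting dictionary plays the role of a column name, and each value the text captured for that column.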
It's possible to include multiple fields in the Wildcard Search. For example:
<IMG*src="#Source#"*alt="#AltText#"*>
This search will extract the "src" and "alt" attributes from all images (where the alt attribute is present). We could enhance this pattern to make the "alt" attribute optional:
<IMG*src="#Source#"*{alt="#AltText#",}*>
By adding the Alternate syntax ("{alt="#AltText#",}") the match includes "alt="#AltText#" OR nothing" (since nothing follows the comma).
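The "alt OR nothing" logic can be approximated with a hand-written Python regular expression. This is not a translation of Power Search's matcher: it assumes each wildcard stays inside a single tag, that attribute values are double-quoted, and it folds the wildcard before "alt" into one optional group so that an existing alt attribute is captured instead of being skipped. The sample HTML is hypothetical.

```python
import re

# Rough regex counterpart of <IMG*src="#Source#"*{alt="#AltText#",}*>
IMG = re.compile(
    r'<IMG[^>]*?src="(?P<Source>[^"]*)"'    # required src attribute
    r'(?:[^>]*?alt="(?P<AltText>[^"]*)")?'  # optional alt attribute
    r'[^>]*>'                               # rest of the tag
)

# Hypothetical markup: one image with alt text, one without.
html = '<p><IMG src="logo.png" alt="Logo"><IMG class="x" src="spacer.gif"></p>'

rows = [(m.group("Source"), m.group("AltText")) for m in IMG.finditer(html)]
print(rows)
```

Images without an alt attribute still produce a row; their AltText value is simply empty (None in this sketch).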