
The webproxy2.patterns file

Updated by bduncan@iii.com

Problem

Some functionality in proxied resources is missing or broken, and WAM forward table edits do not help.

Solution

The webproxy2.patterns file can resolve, to a limited extent, some issues that arise when accessing resources via WAM. The file works by matching strings in the page's source code (what you see when you "view source" on a page). For each match, WAM can be told either to replace the match with something else or to skip rewriting the code from the match point to some other match point. Take care that the match point is unique to the resource; otherwise, what fixes one resource might break others.

How it works

The file contains multiple paired entries, with each pair occupying two lines. A pair can be either a Replace Pair or a Skip Pair; "replace" and "skip" refer to WAM's behavior when rewriting the page code.

 

Replace Pairs

MATCH:[matchValue]
REPLACE:[replaceValue]

[matchValue] -- The string to match against.
[replaceValue] -- The string used to replace [matchValue].

The replaceValue element accepts the following variables:

^d - the proxy domain name
^h - the proxy host name
^i - the proxy port number
^s - the proxy secure port

 

Example:

MATCH:var statsDomain = ".grolier.com";
REPLACE:var statsDomain = ".^d";

This pair finds every occurrence of var statsDomain = ".grolier.com"; in the source code and replaces the vendor domain with the proxy domain name. For example, if your system is libcat.college.edu, each occurrence becomes var statsDomain = ".college.edu";
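The substitution described above can be sketched in Python. This is only an illustration of the documented behavior, not WAM's actual code: the function names and the port values (80 and 443) are assumptions, and the host and domain values follow the libcat.college.edu example.

```python
def expand_replace_value(value, domain, host, port, secure_port):
    """Substitute the ^d, ^h, ^i and ^s variables in a REPLACE value."""
    substitutions = {
        "^d": domain,            # proxy domain name
        "^h": host,              # proxy host name
        "^i": str(port),         # proxy port number
        "^s": str(secure_port),  # proxy secure port
    }
    for variable, actual in substitutions.items():
        value = value.replace(variable, actual)
    return value


def apply_replace_pair(source, match_value, replace_value):
    """Replace every literal occurrence of match_value in the page source."""
    return source.replace(match_value, replace_value)


# Hypothetical site values; the ports are assumed for illustration only.
expanded = expand_replace_value(
    'var statsDomain = ".^d";',
    domain="college.edu",
    host="libcat.college.edu",
    port=80,
    secure_port=443,
)
page = 'var statsDomain = ".grolier.com";'
print(apply_replace_pair(page, 'var statsDomain = ".grolier.com";', expanded))
```

With these values the printed line is var statsDomain = ".college.edu"; which matches the result described for the example pair.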


Skip Pairs


MATCH:[matchValue]
SKIPTO:[skipto]

[matchValue] -- The string to match against.
[skipto] -- The end of the code that WAM skips rewriting. If this value is not defined, WAM skips rewriting until the end of the file.

Example:

MATCH:updatePanel|ctl00_ContentPlaceHolderPath_UpdatePanelPath|
SKIPTO:endUpdatePanel

In this example, WAM would skip rewriting content from the matchValue until the string endUpdatePanel is found (and would not resume rewriting until after endUpdatePanel).

If the SKIPTO value is empty, e.g.:

MATCH:updatePanel|ctl00_ContentPlaceHolderPath_UpdatePanelPath|
SKIPTO:

...WAM will skip rewriting from the matchValue until the end of the file.
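To make the skip behavior concrete, here is a minimal Python sketch that computes which part of a page would be left unrewritten. It is not WAM's implementation. The function name is hypothetical, and two details are assumptions: that the skipped span starts at the beginning of the match, and that a SKIPTO string that never appears is treated like an empty one.

```python
def find_skipped_span(source, match_value, skipto=""):
    """Return (start, end) of the text left unrewritten, or None if no match."""
    start = source.find(match_value)
    if start == -1:
        return None
    if not skipto:
        # Empty SKIPTO: skip rewriting until the end of the file.
        return start, len(source)
    # Look for the SKIPTO string only after the end of the match.
    stop = source.find(skipto, start + len(match_value))
    if stop == -1:
        # Assumption: an unmatched SKIPTO behaves like an empty one.
        return start, len(source)
    # Rewriting resumes only after the SKIPTO string itself.
    return start, stop + len(skipto)


# Hypothetical page fragment using a shortened match value.
page = "head updatePanel|id| body endUpdatePanel tail"
start, end = find_skipped_span(page, "updatePanel|id|", "endUpdatePanel")
print(page[start:end])
```

In this fragment the unrewritten span runs from updatePanel|id| through endUpdatePanel, and only " tail" after it would be rewritten again.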

 

Where can I find the webproxy2.patterns file?

The file is located in the Live Web Server Configuration - live (liveconfig) directory, the same location as the messages.conf and wam_filter files and the external lookup mapping files used by WebBridge.

 

How do I use it?

It's a simple text file; editing can be done using Sierra's Web Master function or Millennium Administration's Web Master mode.

Lines that begin with a hash character (#) are considered comments.

Changes take effect as soon as the file is saved -- no WebPAC restart is necessary. You will need to shift-refresh your browser to load a fresh copy of the page being viewed.
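Because every entry must be a two-line pair, it can help to check a draft of the file before saving it. The following Python sketch parses text in the format described in this article (MATCH: followed by REPLACE: or SKIPTO:, with # comment lines). The function name is hypothetical, and raising an error on a malformed pair is a choice made for this sketch; how WAM itself handles malformed entries is not covered here.

```python
def parse_patterns(text):
    """Parse patterns-file text into a list of (kind, match, value) tuples."""
    pairs = []
    pending_match = None
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank line or comment
        if line.startswith("MATCH:"):
            if pending_match is not None:
                raise ValueError(f"MATCH without a second line: {pending_match!r}")
            pending_match = line[len("MATCH:"):]
        elif line.startswith("REPLACE:") and pending_match is not None:
            pairs.append(("replace", pending_match, line[len("REPLACE:"):]))
            pending_match = None
        elif line.startswith("SKIPTO:") and pending_match is not None:
            pairs.append(("skip", pending_match, line[len("SKIPTO:"):]))
            pending_match = None
        else:
            raise ValueError(f"unexpected line: {line!r}")
    if pending_match is not None:
        raise ValueError(f"MATCH without a second line: {pending_match!r}")
    return pairs


sample = """# for grolier
MATCH:var statsDomain = ".grolier.com";
REPLACE:var statsDomain = ".^d";
MATCH:updatePanel|ctl00_ContentPlaceHolderPath_UpdatePanelPath|
SKIPTO:
"""
for kind, match_value, value in parse_patterns(sample):
    print(kind, repr(match_value), repr(value))
```

An empty SKIPTO value is kept as an empty string, so the caller can tell a skip-to-end-of-file pair apart from one with an explicit end point.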

 

How do I know what to match against and what to replace?

This is the tough part. You need to be comfortable analyzing what you see when you "view source" in a browser. The hope is that something you see will help identify what is causing the problem. If it looks like the problem can be dealt with using webproxy2.patterns, then it's a matter of determining the appropriate match points, followed by a fair amount of trial and error.

The most important thing to keep in mind is that the match must be unique to the resource -- you can't, for example, use MATCH:<head> with SKIPTO:</head>, as that would affect every single resource. It is also important to make sure that the pairs entered into webproxy2.patterns affect only the problem you're trying to address, and don't have ill effects on other functionality within the resource.
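One way to test a candidate match string for uniqueness is to save the "view source" output from several proxied resources and count where the string occurs. The Python sketch below does this against in-memory text. The function names and the sample sources are hypothetical, and the check only covers the pages you actually supply.

```python
def count_matches(match_value, sources):
    """Map each resource name to how often match_value occurs in its source."""
    return {name: text.count(match_value) for name, text in sources.items()}


def resources_affected(match_value, sources):
    """List the resources whose page source contains match_value."""
    counts = count_matches(match_value, sources)
    return [name for name, n in counts.items() if n > 0]


# Hypothetical saved "view source" text from two proxied resources.
saved_sources = {
    "resource_a": '<head><title>A</title></head>'
                  '<script>var statsDomain = ".grolier.com";</script>',
    "resource_b": "<head><title>B</title></head><body>...</body>",
}

# A generic string such as <head> appears in both resources.
print(resources_affected("<head>", saved_sources))
# A resource-specific string appears in only one.
print(resources_affected('var statsDomain = ".grolier.com";', saved_sources))
```

A match string that is reported for more than one resource is a poor candidate, since the pair would then change pages it was never meant to touch.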

Note that the file generally cannot be used to unproxy something that is proxied, other than as described above. For example, it cannot be used to achieve the EZproxy "NeverProxy" behavior vendors often suggest as a workaround for some problems.

 

Real-world examples:

 

Many resources use protocol-relative URLs (URLs that begin with // rather than http:// or https://), which WAM cannot proxy. The following pairs deal with problems in Standard & Poor's NetAdvantage, but the principle is the same for other resources.

# for S&P protocol-relative URLs:
MATCH:src="//solutions.standardandpoors.com
REPLACE:src="//0-solutions.standardandpoors.com.^h
MATCH:href="//www.netadvantage.standardandpoors.com
REPLACE:href="//0-www.netadvantage.standardandpoors.com.^h
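For other resources with the same problem, candidate pairs can be drafted from the page source. The Python sketch below looks for src and href attributes that start with // and prints pairs in the same shape as the S&P entries. It is only a starting point: the function name is hypothetical, the "0-" prefix is copied from the example above and is an assumption about your own configuration, and each suggested match must still be checked for uniqueness before it goes into the file.

```python
import re

# Matches src="//host or href="//host and captures the attribute and the host.
PROTOCOL_RELATIVE = re.compile(r'\b(src|href)="//([A-Za-z0-9.-]+)')


def suggest_pairs(source, port_prefix="0-"):
    """Return candidate MATCH/REPLACE lines for each protocol-relative host."""
    lines = []
    seen = set()
    for attribute, host in PROTOCOL_RELATIVE.findall(source):
        if (attribute, host) in seen:
            continue  # one pair per attribute/host combination
        seen.add((attribute, host))
        lines.append(f'MATCH:{attribute}="//{host}')
        lines.append(f'REPLACE:{attribute}="//{port_prefix}{host}.^h')
    return lines


# Hypothetical page fragment built around the hosts in the S&P example.
page = (
    '<script src="//solutions.standardandpoors.com/a.js"></script>'
    '<a href="//www.netadvantage.standardandpoors.com/x">x</a>'
)
print("\n".join(suggest_pairs(page)))
```

For this fragment the sketch produces the same four lines as the S&P pairs shown above.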

 

The display of Taylor & Francis journals via WAM is usually broken because the vendor's site does not handle the way WAM encodes the tilde in some filenames. Luckily, the files do not need to be proxied; the following pairs unproxy them, leaving the tilde intact.

# for taylor&francis
MATCH:publications og tags
SKIPTO:</head>
MATCH:href="/wro/aoay~product.css
REPLACE:href="http://www.tandfonline.com/wro/aoay~product.css
MATCH:src="/wro/aoay~product.js
REPLACE:src="http://www.tandfonline.com/wro/aoay~product.js

Note that the above pairs are only valid as long as the filenames begin with aoay. This value has been known to change periodically, which requires checking the source code to see what the new value is.

 

Both Emerald Journals and ACS use code similar to that used by Taylor & Francis. Without the following adjustments, the JavaScript responsible for expanding and collapsing journal issues (and other functionality) won't work. As with Taylor & Francis, the filenames change periodically, requiring edits to webproxy2.patterns.

# for emerald and ACS:
MATCH:<title>Emerald Insight
SKIPTO:var _gaq = _gaq || [];
MATCH:EmeraldInsight</title>
SKIPTO:var _gaq = _gaq || [];
MATCH:href="/wro/al63~product.css
REPLACE:href="http://www.emeraldinsight.com/wro/al63~product.css
MATCH:src="/wro/al63~product.js
REPLACE:src="http://www.emeraldinsight.com/wro/al63~product.js

 

Hoovers has some functionality that relies on domain recognition, which fails when the domain is rewritten.

# for hoovers domain mismatch
MATCH:if(isValidDomain(document.domain, "subscriber.hoovers.com
REPLACE:if(isValidDomain(document.domain, "0-subscriber.hoovers.com.^h