`refex.search`

Entry Points

rewrite_string(searcher: refex.search.AbstractSearcher, source: str, path: str, max_iterations=1) → str: Applies any replacements to the input source, and returns the result.

find_iter(searcher: refex.search.AbstractSearcher, data: str, path: str, max_iterations: int = 1) → Iterable[refex.substitution.Substitution]

Finds all search results as an iterable of Substitutions.

Parameters

searcher – The AbstractSearcher to run.
data – The data to search in.
path – The path of the data on disk.
max_iterations – The number of times to try applying and re-applying replacements from the searcher to generate new results. There will always be at least one application.

Yields

Substitutions.

Raises

SkipFileError – This file was skipped and not searched at all.
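
The effect of max_iterations can be illustrated without refex itself. The following is a minimal sketch, assuming a plain regex rewrite stands in for a searcher; rewrite_repeatedly is a hypothetical helper, not part of refex.search, and refex's own re-application logic may differ in detail.

```python
import re

def rewrite_repeatedly(pattern: str, replacement: str, source: str,
                       max_iterations: int = 1) -> str:
    """Applies pattern -> replacement up to max_iterations times (at least once)."""
    for _ in range(max(1, max_iterations)):
        rewritten = re.sub(pattern, replacement, source)
        if rewritten == source:
            break  # Fixed point: another pass would change nothing.
        source = rewritten
    return source

# Each pass strips one redundant pair of parentheses.
pattern = r"\(\((\w+)\)\)"
once = rewrite_repeatedly(pattern, r"(\1)", "(((x)))", max_iterations=1)
twice = rewrite_repeatedly(pattern, r"(\1)", "(((x)))", max_iterations=2)
```

Here a single pass leaves one redundant pair behind, while a second pass picks up the result that only exists after the first replacement has been applied.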

Searchers

ROOT_LABEL: str: The root label, which exists as a span in every returned Substitution.

exception SkipFileError

Exception raised to halt processing and skip this file.

If this was due to an error (i.e. not a SkipFileNoResultsError), it will generally be presented as a diagnostic to the end user.

exception SkipFileNoResultsError

Bases: refex.search.SkipFileError

Exception raised to skip this file because it will not have any results.

This is not, strictly speaking, an error, just an exceptional case and optimization.
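
A caller might treat the two exceptions differently, as in the sketch below. It assumes stand-in exception classes with the same names and hierarchy as documented here; search_files and demo_search are hypothetical helpers, not refex APIs.

```python
class SkipFileError(Exception):
    """Stand-in for refex.search.SkipFileError."""

class SkipFileNoResultsError(SkipFileError):
    """Stand-in: the file cannot have results, which is not a real error."""

def search_files(files, search_one):
    """Runs search_one(path, data) per file, collecting results and diagnostics."""
    results, diagnostics = {}, []
    for path, data in files.items():
        try:
            results[path] = list(search_one(path, data))
        except SkipFileNoResultsError:
            continue  # Optimization case: skip silently.
        except SkipFileError as e:
            diagnostics.append(f"skipped {path}: {e}")
    return results, diagnostics

def demo_search(path, data):
    """Hypothetical search function that finds TODO lines."""
    if not data:
        raise SkipFileNoResultsError("empty file")
    if "\x00" in data:
        raise SkipFileError("binary data")
    return [line for line in data.splitlines() if "TODO" in line]

results, diagnostics = search_files(
    {"a.py": "x = 1  # TODO\n", "b.py": "", "c.bin": "\x00\x01"}, demo_search)
```

The subclass must be caught before its base class, otherwise the silent case would be reported as a diagnostic.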

Base Classes

class AbstractSearcher

A class which finds search/replace results.

parse(data: str, path: str): Parses the data into a representation usable by the searcher.

abstract find_iter_parsed(parsed: refex.parsed_file.ParsedFile) → Iterable[refex.substitution.Substitution]

Finds all matches as an iterable of Substitutions.

Parameters: parsed – The parsed data, as returned by parse().
Returns: An iterable of Substitution objects.
Raises: SkipFileError – This file was skipped and not searched at all.

check_is_included(path: str) → None: Raises SkipFileError if a path should not be searched.

abstract approximate_regex() → Optional[str]

Returns a regular expression that approximates the searcher (or None).

Any file that would contain a match MUST be matched by the returned regex. If no useful regex exists with that property (e.g. no regex except .* would suffice), then it is better to return None.

Returns

Either a regex that matches a file if the search would find a match, or None if the regex would have a very large number of false positives.

The regex is a Python regex in “search” form (i.e. it does not need to match the entire file).
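
The contract above makes the regex usable as a cheap prefilter: a file that does not match can be skipped without parsing it. A minimal sketch, assuming a hypothetical helper may_contain_match that is not part of refex:

```python
import re
from typing import Optional

def may_contain_match(approximate_regex: Optional[str], data: str) -> bool:
    """Returns False only when the file certainly has no match."""
    if approximate_regex is None:
        return True  # No useful approximation: every file must be searched.
    # "Search" form: the regex may match anywhere in the file.
    return re.search(approximate_regex, data) is not None

# Hypothetical searcher for calls to foo(...): every match contains the word "foo".
needs_search = may_contain_match(r"\bfoo\b", "x = foo(1)\n")
can_skip = not may_contain_match(r"\bfoo\b", "x = bar(1)\n")
no_approximation = may_contain_match(None, "x = bar(1)\n")
```

False positives only cost time, but a false negative would silently drop results, which is why the regex MUST match every file that contains a match.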

class WrappedSearcher(searcher: refex.search.AbstractSearcher)

Bases: refex.search.AbstractSearcher

Forwards everything to a wrapped searcher.

Subclasses can override methods to intercept and manipulate calls. By default, calls are forwarded to searcher.

searcher: the wrapped searcher.
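
The forwarding pattern can be sketched with stand-in classes. Everything below (UppercaseSearcher, Wrapped, LimitedSearcher) is hypothetical and only mirrors the method names documented on this page; it is not the refex implementation.

```python
import itertools

class UppercaseSearcher:
    """Hypothetical stand-in searcher: yields every uppercase word."""
    def parse(self, data, path):
        return data.split()

    def find_iter_parsed(self, parsed):
        return (word for word in parsed if word.isupper())

class Wrapped:
    """Forwards every call to the wrapped searcher unless overridden."""
    def __init__(self, searcher):
        self.searcher = searcher

    def parse(self, *args, **kwargs):
        return self.searcher.parse(*args, **kwargs)

    def find_iter_parsed(self, *args, **kwargs):
        return self.searcher.find_iter_parsed(*args, **kwargs)

class LimitedSearcher(Wrapped):
    """Intercepts find_iter_parsed to keep only the first `limit` results."""
    def __init__(self, searcher, limit):
        super().__init__(searcher)
        self.limit = limit

    def find_iter_parsed(self, *args, **kwargs):
        results = super().find_iter_parsed(*args, **kwargs)
        return itertools.islice(results, self.limit)

limited = LimitedSearcher(UppercaseSearcher(), limit=2)
found = list(limited.find_iter_parsed(limited.parse("A b C D", "x.txt")))
```

Only the overridden method changes behavior; parse() still reaches the wrapped searcher untouched.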

class BaseRewritingSearcher

Bases: refex.search.AbstractSearcher

A base class for matchers which rewrite via templates.

This is the normal case, and almost all searchers should be written as a BaseRewritingSearcher.

The templates map matched spans to a template for the replacement. Every match must have a single root label defining the overall match, keyed by ROOT_LABEL.

For example, to replace the entire match with the empty string, equivalent to --sub='' on the command line, one might use:

{ROOT_LABEL: formatting.ShTemplate('')}

Whereas to only replace the ‘a’ span with the empty string, but leave the remainder untouched, like --named-sub=a='', one would instead use:

{'a': formatting.ShTemplate('')}

abstract find_dicts_parsed(parsed: refex.parsed_file.ParsedFile) → Iterable[Tuple[Mapping[Union[str, int], refex.match.Match], Mapping[Union[str, int], refex.formatting.Template]]]

Finds all match/replacement pairs, as an iterable of pairs of dicts.

Parameters: parsed – the return value of a call to parse()
Returns: An iterable of (matches, replacements). matches maps labels to Span objects, replacements maps labels to templates. ROOT_LABEL must be included in every matches dict.
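
How a (matches, replacements) pair turns into rewritten text can be sketched as follows. This is a simplification under stated assumptions: spans are plain (start, end) tuples rather than refex match objects, replacements are plain strings rather than refex.formatting templates, the replaced spans are disjoint, and apply_replacements is a hypothetical helper.

```python
ROOT_LABEL = "__root"  # Same value as the ROOT_LABEL constant in this module.

def apply_replacements(source, matches, replacements):
    """Replaces each labeled span that has a replacement; other spans stay as-is."""
    edits = sorted(
        ((matches[label], text) for label, text in replacements.items()),
        reverse=True,  # Work right to left so earlier offsets stay valid.
    )
    for (start, end), text in edits:
        source = source[:start] + text + source[end:]
    return source

source = "x = a + b"
# ROOT_LABEL covers the whole match "a + b"; 'a' and 'b' are named sub-spans.
matches = {ROOT_LABEL: (4, 9), "a": (4, 5), "b": (8, 9)}

whole_removed = apply_replacements(source, matches, {ROOT_LABEL: ""})
only_a_removed = apply_replacements(source, matches, {"a": ""})
```

The first call corresponds to replacing the entire match, the second to replacing only the named span and leaving the rest of the match untouched.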

key_span_for_dict(parsed: refex.parsed_file.ParsedFile, match_dict: Mapping[Union[str, int], refex.match.Match]) → Optional[Tuple[int, int]]: Returns the key_span that the final Substitution will have.

find_iter_parsed(parsed: refex.python.matcher.PythonParsedFile) → Iterable[refex.substitution.Substitution]

Finds all matches as an iterable of Substitutions.

Parameters: parsed – The parsed data, as returned by parse().
Returns: An iterable of Substitution objects.
Raises: SkipFileError – This file was skipped and not searched at all.

class BasePythonSearcher

Bases: refex.search.AbstractSearcher

Python searcher base class which defines parsing logic.

parse(data: str, filename: str): Returns a refex.python.matcher.PythonParsedFile.

approximate_regex(): Returns None (no approximation).

class BasePythonRewritingSearcher(matcher: refex.python.matcher.Matcher)

Bases: refex.search.BasePythonSearcher, refex.search.BaseRewritingSearcher

Searcher class using refex.python.matchers.

classmethod from_matcher(matcher, templates: Dict[str, refex.formatting.Template]): Creates a searcher from an evaluated matcher, and adds a root label.

find_dicts_parsed(parsed: refex.python.matcher.PythonParsedFile) → Iterable[Tuple[Mapping[Union[str, int], refex.match.Match], Mapping[Union[str, int], refex.formatting.Template]]]

Finds all match/replacement pairs, as an iterable of pairs of dicts.

Parameters: parsed – the return value of a call to parse()
Returns: An iterable of (matches, replacements). matches maps labels to Span objects, replacements maps labels to templates. ROOT_LABEL must be included in every matches dict.

key_span_for_dict(parsed: refex.python.matcher.PythonParsedFile, match_dict: Dict[str, refex.match.Match])

Returns a grouping span for the containing simple AST node.

Substitutions that lie within a simple statement or expression are grouped together and mapped to the span of the largest simple node they are a part of. Every other substitution is mapped to None.

The idea here is that we want easy bite-sized chunks that are useful for quickly checking parseability, and for re-running the fixers over that chunk. Simple statements like import and return, as well as expressions that are part of larger statements, are perfect for this.

Parameters

parsed – The ParsedFile for the same file.
match_dict – The match dict.

Returns

A grouping key, or None.
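
The grouping idea can be approximated with the standard ast module. The sketch below is an illustration only, not refex's algorithm: it assumes ASCII source (ast column offsets are byte offsets), treats any statement without a body field as "simple", and also accepts expressions attached directly to a compound statement. The helper names key_span and _candidates are hypothetical.

```python
import ast
from typing import Iterator, Optional, Tuple

def _candidates(tree: ast.AST) -> Iterator[ast.AST]:
    """Yields simple statements and expressions attached directly to compound ones."""
    for node in ast.walk(tree):
        if not isinstance(node, ast.stmt):
            continue
        if "body" not in node._fields:
            yield node  # Simple statement, e.g. import, return, assignment.
        else:
            # Expressions hanging off a compound statement, e.g. an `if` test.
            yield from (c for c in ast.iter_child_nodes(node)
                        if isinstance(c, ast.expr))

def key_span(source: str, start: int, end: int) -> Optional[Tuple[int, int]]:
    """Returns the character span of the candidate node containing [start, end)."""
    line_starts = [0]
    for line in source.splitlines(keepends=True):
        line_starts.append(line_starts[-1] + len(line))
    for node in _candidates(ast.parse(source)):
        node_start = line_starts[node.lineno - 1] + node.col_offset
        node_end = line_starts[node.end_lineno - 1] + node.end_col_offset
        if node_start <= start and end <= node_end:
            return (node_start, node_end)
    return None

source = "import os\nif os.sep:\n    x = os.getcwd()\n"
call_start = source.index("getcwd")
call_key = key_span(source, call_start, call_start + len("getcwd"))
test_start = source.index("sep")
test_key = key_span(source, test_start, test_start + len("sep"))
keyword_key = key_span(source, source.index("if"), source.index("if") + 2)
```

A span inside the assignment maps to the whole assignment, a span inside the `if` condition maps to the condition expression, and a span that only touches the compound statement itself maps to None.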

class FileRegexFilteredSearcher

Bases: refex.search.AbstractSearcher

Base class for classes that filter files based on a regex.

Instances should have an immutable include_regex attribute. Only files with paths matching that regular expression will pass the check_is_included check.

If other classes are mixed in which define a check_is_included method, this takes the conjunction, and only matches the filename if the other classes agree.
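
The conjunction behavior is what cooperative super() calls give in a mixin chain. A minimal sketch with stand-in classes follows; SkipFile, Base, RegexFiltered, NoTests and is_included are all hypothetical, and whether refex matches the path with a search or a full match is an assumption here (the sketch uses re.search).

```python
import re

class SkipFile(Exception):
    """Stand-in for SkipFileError."""

class Base:
    def check_is_included(self, path):
        return None  # Includes every file by default.

class RegexFiltered(Base):
    include_regex = ""

    def check_is_included(self, path):
        if not re.search(self.include_regex, path):
            raise SkipFile(path)
        super().check_is_included(path)  # Let other mixins veto too.

class NoTests(Base):
    """Hypothetical second filter that excludes test directories."""
    def check_is_included(self, path):
        if "/tests/" in path:
            raise SkipFile(path)
        super().check_is_included(path)

class PyNonTestSearcher(RegexFiltered, NoTests):
    include_regex = r"\.py$"

def is_included(searcher, path):
    try:
        searcher.check_is_included(path)
    except SkipFile:
        return False
    return True

searcher = PyNonTestSearcher()
```

A path is included only when every class in the method resolution order lets it through.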

Wrappers

class PragmaSuppressedSearcher(searcher: refex.search.AbstractSearcher)

Bases: refex.search.WrappedSearcher

Automatically suppresses Substitutions based on pragmas in the file.

class AlsoRegexpSearcher(searcher: refex.search.AbstractSearcher, also=(), also_not=())

Bases: refex.search.WrappedSearcher

Only yields any results if additional regexes are satisfied.

If the provided regexes don’t match the file when they are supposed to, the file will not be considered further.

parse(data, path): Parses the data into a representation usable by the searcher.
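
One plausible reading of the also and also_not parameters is sketched below: every regex in also must be found in the file, and no regex in also_not may be found. This interpretation is an assumption drawn from the parameter names, and file_passes is a hypothetical helper, not a refex function.

```python
import re

def file_passes(data, also=(), also_not=()):
    """True if all `also` regexes are found and no `also_not` regex is found."""
    return (all(re.search(r, data) for r in also)
            and not any(re.search(r, data) for r in also_not))

data = "import os\nprint(os.getcwd())\n"
passes = file_passes(data, also=[r"^import os$"] and [r"import os"],
                     also_not=[r"import sys"])
missing_required = file_passes(data, also=[r"import sys"])
has_forbidden = file_passes(data, also_not=[r"getcwd"])
```

A wrapper built on this check would skip the file entirely when the check fails, rather than filtering individual results.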

class CombinedSearcher(searchers)

Bases: refex.search.AbstractSearcher

Searcher which combines the results of multiple sub-searchers.

Note: all searchers must share compatible ParsedFile types. See the parse() docstring for requirements.

parse(data: str, filename: str)

Parses using each sub-searcher, returning the most specific parsed file.

Here “Most Specific” means the most specific subclass.

This places strong requirements on the searchers:

values returned by one parse() method should always be usable in place of the value returned by another, if they return the same type, or if the type of the first is a subclass of the type of the other.

Ideally, for performance, values should be cached.

Parameters

data – The data to be parsed.
filename – The name of the file.

Returns

The merged / most specific parsed file.
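
Selecting the most specific parsed file can be sketched with stand-in types. The class names below mirror those on this page but are local placeholders, most_specific is a hypothetical helper, and raising TypeError for unrelated types is an assumption about how incompatibility might be reported.

```python
class ParsedFile:
    """Stand-in for refex.parsed_file.ParsedFile."""

class PythonParsedFile(ParsedFile):
    """Stand-in for the more specific Python parsed file type."""

class UnrelatedParsedFile(ParsedFile):
    """A sibling type that is not compatible with PythonParsedFile."""

def most_specific(parsed_files):
    """Returns the value whose type is a subclass of all the others' types."""
    best = parsed_files[0]
    for candidate in parsed_files[1:]:
        if issubclass(type(candidate), type(best)):
            best = candidate  # Same type or more specific: prefer it.
        elif not issubclass(type(best), type(candidate)):
            raise TypeError("sub-searchers returned incompatible parsed files")
    return best

generic, specific = ParsedFile(), PythonParsedFile()
chosen = most_specific([generic, specific, ParsedFile()])
```

Because the chosen value is handed to every sub-searcher, a subclass instance must be usable wherever a base-class instance is expected, which is the requirement stated above.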

check_is_included(*args, **kwargs): Only includes a file if all sub-searchers include it.

approximate_regex()

Returns a regular expression that approximates the searcher (or None).

Any file that would contain a match MUST be matched by the returned regex. If no useful regex exists with that property (e.g. no regex except .* would suffice), then it is better to return None.

Returns

Either a regex that matches a file if the search would find a match, or None if the regex would have a very large number of false positives.

The regex is a Python regex in “search” form (i.e. it does not need to match the entire file).

find_iter_parsed(parsed): Returns all disjoint substitutions for parsed, in sorted order.
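
Producing disjoint results in sorted order can be sketched on bare spans. How refex chooses between overlapping substitutions is not stated here; the sketch assumes the earliest span in sorted order wins, and disjoint_sorted is a hypothetical helper.

```python
def disjoint_sorted(spans):
    """Sorts (start, end) spans and drops any that overlap an earlier kept span."""
    kept = []
    for start, end in sorted(spans):
        if kept and start < kept[-1][1]:
            continue  # Overlaps the previously kept span: drop it.
        kept.append((start, end))
    return kept

# Spans as they might arrive from several sub-searchers, in arbitrary order.
merged = disjoint_sorted([(5, 8), (0, 3), (2, 6)])
```

Disjoint spans can be applied to the source independently, which is why overlapping candidates have to be resolved before results are yielded.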

Concrete Searchers

class RegexSearcher(templates: Dict[str, refex.formatting.Template], compiled)

Bases: refex.search.BaseRewritingSearcher

Searcher class using regular expressions.

Parameters: compiled – A compiled regex.

class PyMatcherRewritingSearcher(matcher: refex.python.matcher.Matcher)

Bases: refex.search.BasePythonRewritingSearcher

Parses the pattern as a --mode=py matcher.

classmethod from_pattern(pattern: str, templates: Optional[Dict[str, refex.formatting.Template]]) → refex.search.PyMatcherRewritingSearcher: Creates a searcher from a --mode=py matcher.

class PyExprRewritingSearcher(matcher: refex.python.matcher.Matcher)

Bases: refex.search.BasePythonRewritingSearcher

Parses the pattern as a --mode=py.expr template.

classmethod from_pattern(pattern: str, templates: Optional[Dict[str, refex.formatting.Template]]) → refex.search.PyExprRewritingSearcher: Creates a searcher from a --mode=py.expr template.

class PyStmtRewritingSearcher(matcher: refex.python.matcher.Matcher)

Bases: refex.search.BasePythonRewritingSearcher

Parses the pattern as a --mode=py.stmt template.

classmethod from_pattern(pattern: str, templates: Optional[Dict[str, refex.formatting.Template]]) → refex.search.PyStmtRewritingSearcher: Creates a searcher from a --mode=py.stmt template.

default_compile_regex(r: str) → Pattern[str]: Compiles a regex with useful flags, and raises ValueError on failure.
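
The error-translation part of this behavior can be sketched with the standard re module. The helper compile_or_value_error is hypothetical, and re.MULTILINE is only a placeholder: which flags refex considers "useful" is not stated on this page.

```python
import re

def compile_or_value_error(pattern: str, flags: int = re.MULTILINE) -> "re.Pattern[str]":
    """Compiles pattern, converting re.error into ValueError."""
    try:
        return re.compile(pattern, flags)
    except re.error as e:
        raise ValueError(f"Invalid regex {pattern!r}: {e}") from e

compiled = compile_or_value_error(r"^foo")
try:
    compile_or_value_error("(")  # Unbalanced parenthesis.
    error = None
except ValueError as e:
    error = e
```

Translating to ValueError lets callers handle a bad pattern like any other invalid argument, without importing re just to catch its exception type.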

class WrappedSearcher(searcher: refex.search.AbstractSearcher)

Forwards everything to a wrapped searcher.

Subclasses can override methods to intercept and manipulate calls. By default, calls are forwarded to searcher.

searcher: the wrapped searcher.

parse(*args, **kwargs): Parses the data into a representation usable by the searcher.

find_iter_parsed(*args, **kwargs)

Finds all matches as an iterable of Substitutions.

Parameters: parsed – The parsed data, as returned by parse().
Returns: An iterable of Substitution objects.
Raises: SkipFileError – This file was skipped and not searched at all.

check_is_included(*args, **kwargs): Raises SkipFileError if a path should not be searched.

approximate_regex()

Returns a regular expression that approximates the searcher (or None).

Any file that would contain a match MUST be matched by the returned regex. If no useful regex exists with that property (e.g. no regex except .* would suffice), then it is better to return None.

Returns

Either a regex that matches a file if the search would find a match, or None if the regex would have a very large number of false positives.

The regex is a Python regex in “search” form (i.e. it does not need to match the entire file).

class PragmaSuppressedSearcher(searcher: refex.search.AbstractSearcher)

Automatically suppresses Substitutions based on pragmas in the file.

find_iter_parsed(parsed: refex.python.matcher.PythonParsedFile) → Iterable[refex.substitution.Substitution]

Finds all matches as an iterable of Substitutions.

Parameters: parsed – The parsed data, as returned by parse().
Returns: An iterable of Substitution objects.
Raises: SkipFileError – This file was skipped and not searched at all.

class FileRegexFilteredSearcher

Base class for classes that filter files based on a regex.

Instances should have an immutable include_regex attribute. Only files with paths matching that regular expression will pass the check_is_included check.

If other classes are mixed in which define a check_is_included method, this takes the conjunction, and only matches the filename if the other classes agree.

include_regex = '': Regex that must match the path name.

check_is_included(path: str) → None: Raises SkipFileError if a path should not be searched.

ROOT_LABEL = '__root': The special metavariable for the root of the match.

MESSAGE_LABEL = '__message': The special metavariable for Substitution.message.

URL_LABEL = '__url': The special metavariable for Substitution.url.

CATEGORY_LABEL = '__category': The special metavariable for Substitution.category.

SIGNIFICANT_LABEL = '__significant'

The special metavariable for Substitution.significant.

This is a bit of a hack to allow significance to be represented as a substitution.

TODO: remove this in favor of a richer SubstitutionTemplate type.

class RegexSearcher(templates: Dict[str, refex.formatting.Template], compiled)

Searcher class using regular expressions.

Parameters: compiled – A compiled regex.

find_dicts_parsed(parsed: refex.python.matcher.PythonParsedFile) → Iterable[Tuple[Mapping[Union[str, int], refex.match.Match], Mapping[Union[str, int], refex.formatting.Template]]]

Finds all match/replacement pairs, as an iterable of pairs of dicts.

Parameters: parsed – the return value of a call to parse()
Returns: An iterable of (matches, replacements). matches maps labels to Span objects, replacements maps labels to templates. ROOT_LABEL must be included in every matches dict.

approximate_regex() → str

Returns a regular expression that approximates the searcher (or None).

Any file that would contain a match MUST be matched by the returned regex. If no useful regex exists with that property (e.g. no regex except .* would suffice), then it is better to return None.

Returns

Either a regex that matches a file if the search would find a match, or None if the regex would have a very large number of false positives.

The regex is a Python regex in “search” form (i.e. it does not need to match the entire file).
