refex.search
Entry Points
-
rewrite_string
(searcher: refex.search.AbstractSearcher, source: str, path: str, max_iterations=1) → str¶ Applies any replacements to the input source, and returns the result.
-
find_iter
(searcher: refex.search.AbstractSearcher, data: str, path: str, max_iterations: int = 1) → Iterable[refex.substitution.Substitution]¶ Finds all search results as an iterable of Substitutions.
- Parameters
searcher – The AbstractSearcher to run.
data – The data to search in.
path – The path of the data on disk.
max_iterations – The number of times to try applying and re-applying replacements from the searcher to generate new results. There will always be at least one application.
- Yields
Substitutions.
- Raises
SkipFileError – This file was skipped and not searched at all.
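The effect of max_iterations can be pictured with a small stand-alone sketch. It does not use refex: rewrite_until_stable is a hypothetical helper that re-applies a plain regex replacement, and the early exit on a fixed point is an assumption rather than documented behaviour.

```python
import re


def rewrite_until_stable(source, pattern, replacement, max_iterations=1):
    """Re-applies a regex replacement for up to max_iterations passes."""
    # There is always at least one application, even for max_iterations < 1.
    for _ in range(max(1, max_iterations)):
        rewritten = re.sub(pattern, replacement, source)
        if rewritten == source:
            break  # Fixed point: another pass would produce nothing new.
        source = rewritten
    return source


# One pass unwraps only the innermost call; three passes unwrap all of them.
once = rewrite_until_stable("f(f(f(x)))", r"f\((\w+)\)", r"\1", max_iterations=1)
thrice = rewrite_until_stable("f(f(f(x)))", r"f\((\w+)\)", r"\1", max_iterations=3)
```

A replacement can expose new matches that did not exist in the original text, which is why a single pass is sometimes not enough.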
Searchers
-
ROOT_LABEL
: str¶ The root label, which exists as a span in every returned Substitution.
-
exception
SkipFileError
¶ Exception raised to halt processing and skip this file.
If this was due to an error (i.e. not a SkipFileNoResultsError), it will generally be presented as a diagnostic to the end user.
-
exception
SkipFileNoResultsError
¶ Bases: refex.search.SkipFileError
Exception raised to skip this file because it will not have any results.
This is not, strictly speaking, an error, just an exceptional case and optimization.
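A caller might tell the two exceptions apart roughly as follows. This is only a sketch: the two classes are local stand-ins for the refex exceptions, and search_or_skip, no_results and unparseable are hypothetical names introduced for illustration.

```python
class SkipFileError(Exception):
    """Stand-in for refex.search.SkipFileError."""


class SkipFileNoResultsError(SkipFileError):
    """Stand-in for refex.search.SkipFileNoResultsError."""


def search_or_skip(path, search):
    """Runs search(path) and returns (results, diagnostic).

    `search` is a hypothetical callable that may raise SkipFileError.
    """
    try:
        return list(search(path)), None
    except SkipFileNoResultsError:
        # An optimization, not a failure: report nothing.
        return [], None
    except SkipFileError as e:
        # A real problem: keep a message to show the user.
        return [], f"skipped {path}: {e}"


def no_results(path):
    raise SkipFileNoResultsError("nothing to find")


def unparseable(path):
    raise SkipFileError("could not parse")


quiet = search_or_skip("a.py", no_results)
noisy = search_or_skip("b.py", unparseable)
```

The subclass has to be caught first, because an `except SkipFileError` clause would also catch SkipFileNoResultsError.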
Base Classes
-
class
AbstractSearcher
¶ A class which finds search/replace results.
-
parse
(data: str, path: str)¶ Parses the data into a representation usable by the searcher.
-
abstract
find_iter_parsed
(parsed: refex.parsed_file.ParsedFile) → Iterable[refex.substitution.Substitution]¶ Finds all matches as an iterable of Substitutions.
- Parameters
parsed – The parsed data, as returned by parse().
- Returns
An iterable of Substitution objects.
- Raises
SkipFileError – This file was skipped and not searched at all.
-
check_is_included
(path: str) → None¶ Raises SkipFileError if a path should not be searched.
-
abstract
approximate_regex
() → Optional[str]¶ Returns a regular expression that approximates the searcher (or None).
Any file that would contain a match MUST be matched by the returned regex. If no useful regex exists with that property (e.g. no regex except .* would suffice), then it is better to return None.
- Returns
Either a regex that matches a file if the search would find a match, or None if the regex would have a very large number of false positives.
The regex is a Python regex in “search” form (i.e. it does not need to match the entire file).
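One way a caller could use such an approximation is as a cheap prefilter before the expensive search. The sketch below is stand-alone and assumes nothing about how refex itself consumes the value; may_contain_match and the sample regex are hypothetical.

```python
import re


def may_contain_match(approximate_regex, data):
    """Returns False only when the file provably cannot contain a match."""
    if approximate_regex is None:
        # No useful approximation: the full search has to run.
        return True
    # "search" form: the regex may match anywhere in the file.
    return re.search(approximate_regex, data) is not None


# Hypothetical searcher for calls to foo(); any match must mention "foo".
approx = r"\bfoo\b"
worth_parsing = may_contain_match(approx, "x = foo(1)\n")
skippable = not may_contain_match(approx, "x = bar(1)\n")
```

False positives only cost time, while a false negative would silently drop results. That is why the contract requires every matching file to match the regex.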
-
-
class
WrappedSearcher
(searcher: refex.search.AbstractSearcher)¶ Bases: refex.search.AbstractSearcher
Forwards everything to a wrapped searcher.
Subclasses can override methods to intercept and manipulate calls. By default, calls are forwarded to searcher.
-
searcher
¶ The wrapped searcher.
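The forwarding pattern can be sketched without refex as follows. Wrapped is a local stand-in for WrappedSearcher, and WordSearcher and FirstN are hypothetical classes used only to show one intercepted method.

```python
import itertools


class WordSearcher:
    """Hypothetical searcher: every whitespace-separated word is a result."""

    def parse(self, data, path):
        return data

    def find_iter_parsed(self, parsed):
        return iter(parsed.split())


class Wrapped:
    """Stand-in for WrappedSearcher: forwards calls to self.searcher."""

    def __init__(self, searcher):
        self.searcher = searcher

    def parse(self, data, path):
        return self.searcher.parse(data, path)

    def find_iter_parsed(self, parsed):
        return self.searcher.find_iter_parsed(parsed)


class FirstN(Wrapped):
    """Intercepts find_iter_parsed and keeps only the first n results."""

    def __init__(self, searcher, n):
        super().__init__(searcher)
        self.n = n

    def find_iter_parsed(self, parsed):
        return itertools.islice(super().find_iter_parsed(parsed), self.n)


limited = FirstN(WordSearcher(), 2)
results = list(limited.find_iter_parsed(limited.parse("a b c d", "demo.txt")))
```

Only the overridden method changes behaviour. parse still reaches the wrapped searcher untouched.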
-
-
class
BaseRewritingSearcher
¶ Bases: refex.search.AbstractSearcher
A base class for matchers which rewrite via templates.
This is the normal case, and almost all searchers should be written as a BaseRewritingSearcher.
The templates map the labels of matched spans to templates for their replacements. Every match must have a single root label defining the overall match, keyed by ROOT_LABEL.
For example, to replace the entire match with the empty string, equivalent to --sub='' on the command line, one might use:
{ROOT_LABEL: formatting.ShTemplate('')}
Whereas to only replace the ‘a’ span with the empty string, but leave the remainder untouched, like --named-sub=a='', one would instead use:
{'a': formatting.ShTemplate('')}
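The difference between a root replacement and a named-span replacement can be illustrated with the standard re module alone. In this sketch plain strings stand in for refex templates, a named regex group stands in for a labelled span, and apply_templates is a hypothetical helper. Letting the root entry take precedence over named entries is an assumption of the sketch.

```python
import re

ROOT_LABEL = "__root"  # Same value as the documented ROOT_LABEL constant.


def apply_templates(source, pattern, templates):
    """Replaces labelled spans of the first match with plain strings."""
    m = re.search(pattern, source)
    if m is None:
        return source
    if ROOT_LABEL in templates:
        # The root label covers the whole match.
        start, end = m.span()
        return source[:start] + templates[ROOT_LABEL] + source[end:]
    # Replace named spans right to left so earlier offsets stay valid.
    spans = sorted(
        ((m.span(label), text) for label, text in templates.items()),
        reverse=True,
    )
    for (start, end), text in spans:
        source = source[:start] + text + source[end:]
    return source


pattern = r"foo\((?P<a>\w+)\)"
whole = apply_templates("x = foo(bar)", pattern, {ROOT_LABEL: ""})
only_a = apply_templates("x = foo(bar)", pattern, {"a": ""})
```

Here whole drops the entire call, while only_a empties just the argument and leaves foo() in place.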
-
abstract
find_dicts_parsed
(parsed: refex.parsed_file.ParsedFile) → Iterable[Tuple[Mapping[Union[str, int], refex.match.Match], Mapping[Union[str, int], refex.formatting.Template]]]¶ Finds all match/replacement pairs, as an iterable of pairs of dicts.
- Parameters
parsed – The return value of a call to parse().
- Returns
An iterable of (matches, replacements). matches maps labels to Span objects, replacements maps labels to templates. ROOT_LABEL must be included in every matches dict.
-
key_span_for_dict
(parsed: refex.parsed_file.ParsedFile, match_dict: Mapping[Union[str, int], refex.match.Match]) → Optional[Tuple[int, int]]¶ Returns the key_span that the final Substitution will have.
-
find_iter_parsed
(parsed: refex.python.matcher.PythonParsedFile) → Iterable[refex.substitution.Substitution]¶ Finds all matches as an iterable of Substitutions.
- Parameters
parsed – The parsed data, as returned by parse().
- Returns
An iterable of Substitution objects.
- Raises
SkipFileError – This file was skipped and not searched at all.
-
-
class
BasePythonSearcher
¶ Bases:
refex.search.AbstractSearcher
Python searcher base class which defines parsing logic.
-
parse
(data: str, filename: str)¶ Returns a refex.python.matcher.PythonParsedFile.
-
approximate_regex
()¶ Returns None (no approximation).
-
-
class
BasePythonRewritingSearcher
(matcher: refex.python.matcher.Matcher)¶ Bases: refex.search.BasePythonSearcher, refex.search.BaseRewritingSearcher
Searcher class using refex.python.matchers.
-
classmethod
from_matcher
(matcher, templates: Dict[str, refex.formatting.Template])¶ Creates a searcher from an evaluated matcher, and adds a root label.
-
find_dicts_parsed
(parsed: refex.python.matcher.PythonParsedFile) → Iterable[Tuple[Mapping[Union[str, int], refex.match.Match], Mapping[Union[str, int], refex.formatting.Template]]]¶ Finds all match/replacement pairs, as an iterable of pairs of dicts.
- Parameters
parsed – The return value of a call to parse().
- Returns
An iterable of (matches, replacements). matches maps labels to Span objects, replacements maps labels to templates. ROOT_LABEL must be included in every matches dict.
-
key_span_for_dict
(parsed: refex.python.matcher.PythonParsedFile, match_dict: Dict[str, refex.match.Match])¶ Returns a grouping span for the containing simple AST node.
Substitutions that lie within a simple statement or expression are grouped together and mapped to the span of the largest simple node they are a part of. Every other substitution is mapped to None.
The idea here is that we want easy bite-sized chunks that are useful for quickly checking parseability, and for re-running the fixers over that chunk. Simple statements like import and return, as well as expressions that are part of larger statements, are perfect for this.
- Parameters
parsed – The ParsedFile for the same file.
match_dict – The match dict.
- Returns
A grouping key, or None.
-
-
class
FileRegexFilteredSearcher
¶ Bases: refex.search.AbstractSearcher
Base class for classes that filter files based on a regex.
Instances should have an immutable include_regex attribute. Only files with paths matching that regular expression will pass the check_is_included check.
If other classes are mixed in which define a check_is_included method, this takes the conjunction, and only matches the filename if the other classes agree.
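The conjunction behaviour can be sketched with cooperative super() calls. Every class below is a local stand-in rather than the refex implementation: RegexFiltered models FileRegexFilteredSearcher, NoTests is a hypothetical mixin, and using re.search (instead of re.match) on the path is an assumption.

```python
import re


class SkipFileError(Exception):
    """Stand-in for refex.search.SkipFileError."""


class Searcher:
    def check_is_included(self, path):
        return None  # By default every path is included.


class RegexFiltered(Searcher):
    include_regex = ""

    def check_is_included(self, path):
        if not re.search(self.include_regex, path):
            raise SkipFileError(f"{path} does not match {self.include_regex!r}")
        # Defer to the next class in the MRO so that its check applies too.
        super().check_is_included(path)


class NoTests(Searcher):
    """Hypothetical mixin that excludes test files."""

    def check_is_included(self, path):
        if "_test" in path:
            raise SkipFileError(f"{path} is a test file")
        super().check_is_included(path)


class PythonNonTestSearcher(RegexFiltered, NoTests):
    include_regex = r"\.py$"


def is_included(searcher, path):
    try:
        searcher.check_is_included(path)
        return True
    except SkipFileError:
        return False


s = PythonNonTestSearcher()
verdicts = [is_included(s, p) for p in ("a.py", "a_test.py", "a.txt")]
```

A path is searched only when no class in the chain raises, which is the conjunction of all the checks.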
Wrappers
-
class
PragmaSuppressedSearcher
(searcher: refex.search.AbstractSearcher)¶ Bases: refex.search.WrappedSearcher
Automatically suppresses Substitutions based on pragmas in the file.
-
class
AlsoRegexpSearcher
(searcher: refex.search.AbstractSearcher, also=(), also_not=())¶ Bases: refex.search.WrappedSearcher
Only yields any results if additional regexes are satisfied.
If the provided regexes don’t match the file when they are supposed to, the file will not be considered further.
-
parse
(data, path)¶ Parses the data into a representation usable by the searcher.
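The file-level check described for AlsoRegexpSearcher might look like the sketch below. The exact semantics are an assumption: here every regex in also must be found somewhere in the file and no regex in also_not may be found. file_passes is a hypothetical helper, not a refex function.

```python
import re


def file_passes(data, also=(), also_not=()):
    """True if all `also` regexes match and no `also_not` regex matches."""
    required = all(re.search(r, data) for r in also)
    forbidden = any(re.search(r, data) for r in also_not)
    return required and not forbidden


source = "import os\nprint(os.getcwd())\n"
kept = file_passes(source, also=[r"^import os$"], also_not=[r"\bsubprocess\b"])
dropped = file_passes(source, also=[r"^import sys$"])
```

The patterns are not compiled with re.MULTILINE, so ^ and $ anchor only at the start and end of the whole file.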
-
-
class
CombinedSearcher
(searchers)¶ Bases: refex.search.AbstractSearcher
Searcher which combines the results of multiple sub-searchers.
Note: all searchers must share compatible ParsedFile types. See the parse() docstring for requirements.
-
parse
(data: str, filename: str)¶ Parses using each sub-searcher, returning the most specific parsed file.
Here “most specific” means the most specific subclass.
This places strong requirements on the searchers: values returned by one parse() method should always be usable in place of the value returned by another, if they return the same type, or if the type of the first is a subclass of the type of the other.
Ideally, for performance, values should be cached.
- Parameters
data – The data to be parsed.
filename – The name of the file.
- Returns
The merged / most specific parsed file.
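Picking the most specific parsed file can be sketched as below. ParsedFile and PythonParsedFile are empty stand-ins for the refex classes, most_specific is a hypothetical helper, and raising TypeError for unrelated types is an assumption of the sketch.

```python
class ParsedFile:
    """Stand-in for refex.parsed_file.ParsedFile."""


class PythonParsedFile(ParsedFile):
    """Stand-in for refex.python.matcher.PythonParsedFile."""


class OtherParsedFile(ParsedFile):
    """Hypothetical sibling type, unrelated to PythonParsedFile."""


def most_specific(parsed_files):
    """Returns the value whose type is a subclass of all the others."""
    best = parsed_files[0]
    for candidate in parsed_files[1:]:
        if isinstance(candidate, type(best)):
            best = candidate  # Same type or a subclass: usable in its place.
        elif not isinstance(best, type(candidate)):
            raise TypeError("sub-searchers returned incompatible parsed files")
    return best


generic, python = ParsedFile(), PythonParsedFile()
chosen = most_specific([generic, python])
```

The subclass instance can be handed to every sub-searcher, which is what makes the "usable in place of" requirement necessary.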
-
check_is_included
(*args, **kwargs)¶ Only includes a file if all sub-searchers include it.
-
approximate_regex
()¶ Returns a regular expression that approximates the searcher (or None).
Any file that would contain a match MUST be matched by the returned regex. If no useful regex exists with that property (e.g. no regex except .* would suffice), then it is better to return None.
- Returns
Either a regex that matches a file if the search would find a match, or None if the regex would have a very large number of false positives.
The regex is a Python regex in “search” form (i.e. it does not need to match the entire file).
-
find_iter_parsed
(parsed)¶ Returns all disjoint substitutions for parsed, in sorted order.
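"Disjoint, in sorted order" can be pictured with plain (start, end) spans. Which of two overlapping substitutions is kept is not stated here, so keeping the one that sorts first is purely an assumption, and disjoint_sorted is a hypothetical helper.

```python
def disjoint_sorted(spans):
    """Sorts half-open (start, end) spans and drops any that overlap a kept one."""
    kept = []
    for start, end in sorted(spans):
        if kept and start < kept[-1][1]:
            continue  # Overlaps the previously kept span.
        kept.append((start, end))
    return kept


merged = disjoint_sorted([(5, 8), (0, 3), (2, 6)])
```

In this example (2, 6) is dropped because it overlaps (0, 3), and (5, 8) survives because the dropped span no longer blocks it.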
-
Concrete Searchers
-
class
RegexSearcher
(templates: Dict[str, refex.formatting.Template], compiled)¶ Bases: refex.search.BaseRewritingSearcher
Searcher class using regular expressions.
- Parameters
compiled – A compiled regex.
-
class
PyMatcherRewritingSearcher
(matcher: refex.python.matcher.Matcher)¶ Bases: refex.search.BasePythonRewritingSearcher
Parses the pattern as a --mode=py matcher.
-
classmethod
from_pattern
(pattern: str, templates: Optional[Dict[str, refex.formatting.Template]]) → refex.search.PyMatcherRewritingSearcher¶ Creates a searcher from a --mode=py matcher.
-
-
class
PyExprRewritingSearcher
(matcher: refex.python.matcher.Matcher)¶ Bases: refex.search.BasePythonRewritingSearcher
Parses the pattern as a --mode=py.expr template.
-
classmethod
from_pattern
(pattern: str, templates: Optional[Dict[str, refex.formatting.Template]]) → refex.search.PyExprRewritingSearcher¶ Creates a searcher from a --mode=py.expr template.
-
-
class
PyStmtRewritingSearcher
(matcher: refex.python.matcher.Matcher)¶ Bases: refex.search.BasePythonRewritingSearcher
Parses the pattern as a --mode=py.stmt template.
-
classmethod
from_pattern
(pattern: str, templates: Optional[Dict[str, refex.formatting.Template]]) → refex.search.PyStmtRewritingSearcher¶ Creates a searcher from a --mode=py.stmt template.
-
-
default_compile_regex
(r: str) → Pattern[str]¶ Compiles a regex with useful flags, and raises ValueError on failure.
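A helper with this contract could be sketched as follows. The description does not say which flags count as "useful", so re.MULTILINE here is only a guess, and compile_regex is a hypothetical name rather than the refex function.

```python
import re


def compile_regex(r):
    """Compiles r, converting re.error into ValueError."""
    try:
        # re.MULTILINE is an assumed flag, chosen for illustration only.
        return re.compile(r, re.MULTILINE)
    except re.error as e:
        raise ValueError(f"unable to compile regex {r!r}: {e}") from e


compiled = compile_regex(r"^import (\w+)$")
found = compiled.findall("import os\nimport sys\n")

try:
    compile_regex("(")  # Unbalanced parenthesis.
    failed = False
except ValueError:
    failed = True
```

Raising ValueError lets callers handle a bad pattern without depending on the re module's own exception type.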
-
class
WrappedSearcher
(searcher: refex.search.AbstractSearcher) Forwards everything to a wrapped searcher.
Subclasses can override methods to intercept and manipulate calls. By default, calls are forwarded to searcher.
-
searcher
The wrapped searcher.
-
parse
(*args, **kwargs)¶ Parses the data into a representation usable by the searcher.
-
find_iter_parsed
(*args, **kwargs)¶ Finds all matches as an iterable of Substitutions.
- Parameters
parsed – The parsed data, as returned by parse().
- Returns
An iterable of Substitution objects.
- Raises
SkipFileError – This file was skipped and not searched at all.
-
check_is_included
(*args, **kwargs)¶ Raises SkipFileError if a path should not be searched.
-
approximate_regex
()¶ Returns a regular expression that approximates the searcher (or None).
Any file that would contain a match MUST be matched by the returned regex. If no useful regex exists with that property (e.g. no regex except .* would suffice), then it is better to return None.
- Returns
Either a regex that matches a file if the search would find a match, or None if the regex would have a very large number of false positives.
The regex is a Python regex in “search” form (i.e. it does not need to match the entire file).
-
-
class
PragmaSuppressedSearcher
(searcher: refex.search.AbstractSearcher) Automatically suppresses Substitutions based on pragmas in the file.
-
find_iter_parsed
(parsed: refex.python.matcher.PythonParsedFile) → Iterable[refex.substitution.Substitution]¶ Finds all matches as an iterable of Substitutions.
- Parameters
parsed – The parsed data, as returned by parse().
- Returns
An iterable of Substitution objects.
- Raises
SkipFileError – This file was skipped and not searched at all.
-
-
class
FileRegexFilteredSearcher
Base class for classes that filter files based on a regex.
Instances should have an immutable include_regex attribute. Only files with paths matching that regular expression will pass the check_is_included check.
If other classes are mixed in which define a check_is_included method, this takes the conjunction, and only matches the filename if the other classes agree.
-
include_regex
= ''¶ Regex that must match the path name.
-
check_is_included
(path: str) → None¶ Raises SkipFileError if a path should not be searched.
-
-
ROOT_LABEL
= '__root' The special metavariable for the root of the match.
-
MESSAGE_LABEL
= '__message'¶ The special metavariable for Substitution.message.
-
URL_LABEL
= '__url'¶ The special metavariable for Substitution.url.
-
CATEGORY_LABEL
= '__category'¶ The special metavariable for Substitution.category.
-
SIGNIFICANT_LABEL
= '__significant'¶ The special metavariable for Substitution.significant.
This is a bit of a hack to allow significance to be represented as a substitution.
TODO: remove this in favor of a richer SubstitutionTemplate type.
-
class
RegexSearcher
(templates: Dict[str, refex.formatting.Template], compiled) Searcher class using regular expressions.
- Parameters
compiled – A compiled regex.
-
find_dicts_parsed
(parsed: refex.python.matcher.PythonParsedFile) → Iterable[Tuple[Mapping[Union[str, int], refex.match.Match], Mapping[Union[str, int], refex.formatting.Template]]]¶ Finds all match/replacement pairs, as an iterable of pairs of dicts.
- Parameters
parsed – The return value of a call to parse().
- Returns
An iterable of (matches, replacements). matches maps labels to Span objects, replacements maps labels to templates. ROOT_LABEL must be included in every matches dict.
-
approximate_regex
() → str¶ Returns a regular expression that approximates the searcher (or None).
Any file that would contain a match MUST be matched by the returned regex. If no useful regex exists with that property (e.g. no regex except .* would suffice), then it is better to return None.
- Returns
Either a regex that matches a file if the search would find a match, or None if the regex would have a very large number of false positives.
The regex is a Python regex in “search” form (i.e. it does not need to match the entire file).