There are two ways to set up the environment. The first simply installs all dependencies to get you up and running. The second installs the transformers fork required by the project in editable mode; this setup is recommended if you want to make changes to the transformers library while working on implementation details.
- Clone this repo
- Build and activate the environment:

  ```shell
  cd semantic_decoding
  # build env
  conda env create -f env/environment.yml
  # conda env create -f env/environment-gpu.yml # use instead for gpu support
  conda activate sem
  ```
- Clone this repo
- Clone the hf fork to a sibling directory

  ```shell
  # the repos should be in the same directory for the yml install to work; otherwise adapt the path in the yml file
  ls
  # my_folder/
  # semantic_decoding/     # this repo
  # transformers/          # the hf fork
  ```

- Comment out the remote source of transformers in the `environment*.yml` file and point to the local directory instead:
  ```diff
  name: sem
  channels:
    - ...
  dependencies:
    - ...
    - pip
    - pip:
  -    - git+https://github.com/philheller/transformers.git
  +    # - git+https://github.com/philheller/transformers.git
  -    # - -e ../transformers
  +    - -e ../transformers
  ```
- Install all dependencies (currently only `environment.yml` & `environment-gpu.yml` are up to date):

  ```shell
  # from the root of this repo
  conda env create -f env/environment.yml
  # conda env create -f env/environment-gpu.yml # use instead for gpu support
  # make sure the pip dependencies in the yml file have properly been installed
  ```
For usage, activate the environment and see Usage.

```shell
conda activate sem
```
Semantic decoding is provided through the `Generator` class. Here is a simple usage example:
```python
# generator
from semantic_decoding.generators.generator import Generator
# generation config for syntactic and semantic level
from transformers.generation.utils import GenerationConfig
from semantic_decoding.generators.semantic import SemanticGenerationConfig

# load the generator
generator = Generator(
    model_name,
    "en_core_web_sm",
    device,
    unique_key=args.aggregation_key
)

# generation configs
# syntactic
syntactic_generation_config: GenerationConfig = GenerationConfig(
    max_new_tokens=4,
    num_beams=200,
    num_return_sequences=200,
    access_token=access_token,
    # ...
)
# semantic
semantic_generation_config: SemanticGenerationConfig = SemanticGenerationConfig(
    num_beams=2,
    num_return_sequences=2,
    max_overall_tokens=1000,
    max_overall_generated_tokens=1000,
    nest_beam_search=True,
)

# generate
res = generator.generate(
    prompts=["Obama was born in"],
    syntactic_generation_config=syntactic_generation_config,
    semantic_generation_config=semantic_generation_config,
)
```

Note that `model_name`, `device`, `access_token`, and `args.aggregation_key` are placeholders that must be defined by the caller.
Syntactic models are the HF models. Available models for semantic generation can be viewed and implemented in `semantic_model.py`. Currently, some NER models and some spaCy models are supported. Adding a new one requires implementing the `SemanticDataModel` class and registering it in the `SemanticModelFactory`. The already implemented models serve as examples.
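As a rough sketch of that pattern, the following uses illustrative stand-ins: the base class, method names, and factory API below are assumptions for demonstration, not the real interface in `semantic_model.py`.

```python
from abc import ABC, abstractmethod

# Illustrative stand-ins only -- the real base class and factory live in
# semantic_model.py and their interfaces may differ.
class SemanticDataModel(ABC):
    @abstractmethod
    def extract(self, text: str) -> list:
        """Return the semantic units (e.g. entities) found in `text`."""

class SemanticModelFactory:
    _registry = {}

    @classmethod
    def register(cls, name: str, model_cls: type) -> None:
        cls._registry[name] = model_cls

    @classmethod
    def create(cls, name: str) -> "SemanticDataModel":
        return cls._registry[name]()

# A toy model: treat capitalized words as "entities"
class ToyNERModel(SemanticDataModel):
    def extract(self, text: str) -> list:
        return [w for w in text.split() if w[:1].isupper()]

SemanticModelFactory.register("toy_ner", ToyNERModel)

model = SemanticModelFactory.create("toy_ner")
print(model.extract("Obama was born in Hawaii"))  # ['Obama', 'Hawaii']
```

The real implementations in the repo follow the actual abstract interface; consult them before adding a model.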
The `Generator.generate` function is kept structurally analogous to the `transformers` library. Currently, these decoding modes are supported:
- Greedy decoding
- Beam Search decoding
- Nested Beam Search decoding
The appropriate mode is selected based on the semantic and syntactic generation configs. For more details, see the `SemanticGenerationConfig`.
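How that dispatch could look, as an assumption inferred from the config fields shown earlier (the actual selection happens inside `Generator.generate` and may use different criteria):

```python
# Hypothetical dispatch -- illustrates how a decoding mode could follow
# from the semantic config; not the project's actual logic.
def select_mode(semantic_num_beams: int, nest_beam_search: bool) -> str:
    if semantic_num_beams <= 1:
        return "greedy"            # single hypothesis: greedy decoding
    if nest_beam_search:
        return "nested_beam_search"  # beams on both levels
    return "beam_search"           # semantic-level beam search only

print(select_mode(1, False))  # greedy
print(select_mode(2, True))   # nested_beam_search
```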
Central to generation is the `Generator` class, which orchestrates the process. Two helper classes structure the code further, one each for the syntactic and the semantic side:
- the `SyntacticGenerator`
- the `SemanticGenerator`

The `SyntacticGenerator` contains the functions for manipulating syntactic hypotheses; the `SemanticGenerator` contains the functions for manipulating semantic hypotheses.
Both classes also contain the models and tokenizers used:

```python
# to decode syntactic tokens
syntactic_generator.tokenizer.batch_decode(syntactic_output)
# to decode semantic tokens
semantic_generator.tokenizer.batch_decode(semantic_output)
```
- Batching and scores: Scores do not resolve to the same values depending on batching and masking. This can change the results of beam search (more on that in the tests regarding differences in scores). To make a result reproducible (and thus easily accessible), avoid batching; unbatched computations can be reproduced.
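The underlying cause is ordinary floating-point behavior: padding and masking change the order and operands of the reductions behind the scores, and float addition is not associative, so batched and unbatched log-probabilities can drift apart by tiny amounts that beam search may amplify. A minimal plain-Python illustration (unrelated to the project's code):

```python
# Floating-point addition is not associative: summing the same numbers
# in a different order (as batching/masking effectively does) can give
# different results.
xs = [1e16, 1.0, -1e16]
left_to_right = (xs[0] + xs[1]) + xs[2]  # 1.0 is absorbed by 1e16 -> 0.0
reordered = (xs[0] + xs[2]) + xs[1]      # large terms cancel first -> 1.0
print(left_to_right, reordered)  # 0.0 1.0
```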