API Reference
This documentation is generated automatically from the package’s docstrings.
- class promptstability.core.PromptStabilityAnalysis(annotation_function, data, metric_fn=<function nominal_metric>, parse_function=None, load_generation_models=True)
Bases: object
Core prompt-stability estimation class.
The class supports repeated-run intra-prompt stability estimation, paraphrase-based inter-prompt stability estimation, post hoc rescoring from saved annotation tables, and summary diagnostics for intra- and inter-PSS outputs.
- bootstrap_krippendorff(df, annotator_col, bootstrap_samples, confidence_level=95)
Compute Krippendorff’s alpha with bootstrap confidence intervals.
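The general idea can be sketched as a percentile bootstrap over annotated items. This is only an illustration, not the package's implementation, which this reference does not show. The helper names nominal_alpha and bootstrap_alpha_ci are hypothetical, the input is assumed to be a plain list of per-item label lists rather than a DataFrame with an annotator column, and a nominal distance is assumed.

```python
import random
from collections import Counter


def nominal_alpha(units):
    """Krippendorff's alpha for nominal labels.

    `units` is a list of label lists, one inner list per annotated item.
    Returns None when alpha is undefined (no pairable labels or no variation).
    """
    totals = Counter()   # label frequencies over all pairable values
    observed = 0.0       # within-item disagreement, weighted by 1 / (m - 1)
    n = 0
    for labels in units:
        m = len(labels)
        if m < 2:
            continue     # items with a single label cannot be paired
        counts = Counter(labels)
        observed += (m * m - sum(c * c for c in counts.values())) / (m - 1)
        totals.update(counts)
        n += m
    if n < 2:
        return None
    expected = (n * n - sum(c * c for c in totals.values())) / (n - 1)
    if expected == 0:
        return None
    return 1.0 - observed / expected


def bootstrap_alpha_ci(units, bootstrap_samples=1000, confidence_level=95, seed=0):
    """Percentile bootstrap: resample items with replacement, recompute alpha."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(bootstrap_samples):
        resample = [rng.choice(units) for _ in units]
        value = nominal_alpha(resample)
        if value is not None:          # skip resamples where alpha is undefined
            estimates.append(value)
    if not estimates:
        return nominal_alpha(units), None, None
    estimates.sort()
    tail = (100 - confidence_level) / 200   # e.g. 0.025 for a 95% interval
    lower = estimates[int(tail * (len(estimates) - 1))]
    upper = estimates[int((1 - tail) * (len(estimates) - 1))]
    return nominal_alpha(units), lower, upper
```

Resampling whole items (rather than individual labels) keeps each item's set of annotations together, which is the usual choice when the items are the sampled units.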
- extract_inter_score_map(annotated_df)
- extract_intra_score_map(annotated_df, analysis_modes=None)
- inter_pss(original_text, prompt_postfix=None, nr_variations=5, temperatures=None, iterations=1, bootstrap_samples=1000, print_prompts=False, edit_prompts_path=None, plot=False, save_path=None, save_csv=None)
Evaluate between-prompt stability across paraphrase temperatures.
- intra_pss(original_text, prompt_postfix=None, iterations=10, bootstrap_samples=1000, analysis_modes=None, plot=False, plot_mode='cumulative_alpha', save_path=None, save_csv=None, return_summaries=False, summary_threshold=0.8, estimate_tolerance=0.01, precision_tolerance=0.02)
Evaluate within-prompt stability via repeated prompt runs.
By default this preserves the original package behavior and returns a cumulative intra-PSS series. When analysis_modes includes adjacent_alpha, the method also computes an adjacent-run series that compares run j to run j-1.
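To make the difference between the two series concrete, the sketch below lists which runs would feed each score. It does not call the package. The function names are hypothetical, and it assumes that runs are numbered from 0 and that "cumulative" means all runs up to and including run j, neither of which is confirmed by this reference.

```python
def cumulative_windows(iterations):
    """Runs 0..j for each j >= 1 (assumed meaning of the cumulative series)."""
    return [list(range(j + 1)) for j in range(1, iterations)]


def adjacent_windows(iterations):
    """Run j paired with run j-1 for each j >= 1."""
    return [[j - 1, j] for j in range(1, iterations)]


# Show, for four runs, which runs each mode would compare at step j.
for j, (cum, adj) in enumerate(zip(cumulative_windows(4), adjacent_windows(4)), start=1):
    print(f"run {j}: cumulative uses {cum}, adjacent uses {adj}")
```

Under these assumptions the cumulative series grows by one run at each step, while the adjacent series always compares exactly two runs.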
- manual_inter_pss(edit_prompts_path, bootstrap_samples=1000, plot=False, save_path=None, save_csv=None)
Evaluate inter-PSS from a manually edited prompt-variation CSV.
- score_intra_annotations(annotated_df, bootstrap_samples=1000, analysis_modes=None)
Recompute intra-PSS metrics from an existing long-format annotation table.
- Parameters:
annotated_df (pandas.DataFrame) – Long-format annotation data with at least id, annotation, and iteration columns.
bootstrap_samples (int, optional) – Number of bootstrap samples used for confidence intervals.
analysis_modes (list[str], optional) – Subset of ["cumulative_alpha", "adjacent_alpha"].
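As an illustration of the long-format layout, the sketch below uses one record per (id, iteration) pair and groups the annotations by item. The data values and the helper labels_per_item are hypothetical. Plain dictionaries stand in for DataFrame rows here; with pandas, annotated_df.to_dict("records") yields records of this shape.

```python
from collections import defaultdict

# One row per item per run: the three columns named in the parameter description.
rows = [
    {"id": 1, "annotation": "pos", "iteration": 0},
    {"id": 1, "annotation": "pos", "iteration": 1},
    {"id": 1, "annotation": "neg", "iteration": 2},
    {"id": 2, "annotation": "neg", "iteration": 0},
    {"id": 2, "annotation": "neg", "iteration": 1},
    {"id": 2, "annotation": "neg", "iteration": 2},
]


def labels_per_item(rows, iterations=None):
    """Collect each item's annotations, optionally restricted to some runs."""
    grouped = defaultdict(list)
    for row in rows:
        if iterations is None or row["iteration"] in iterations:
            grouped[row["id"]].append(row["annotation"])
    return dict(grouped)


print(labels_per_item(rows))                     # all runs
print(labels_per_item(rows, iterations={0, 1}))  # only the first two runs
```

Keeping the table in long format means a rescoring step can select any subset of runs without reshaping the stored annotations.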
- summarize_inter_scores(score_map, threshold=0.8)
- summarize_intra_scores(score_map, threshold=0.8, estimate_tolerance=0.01, precision_tolerance=0.02)
- promptstability.core.get_api_key(api='openai')
Retrieve an API key for the specified service from environment variables.
- Parameters:
api (str, optional) – API service name. Supported values are openai, mistral, anthropic, cohere, and huggingface.
- Returns:
The API key value.
- Return type:
str
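A lookup of this kind can be sketched with the standard library alone. The function read_api_key and the environment variable names below are assumptions made for illustration. This reference does not state which variable names the package reads or how it reports a missing key.

```python
import os

# Assumed variable names; the reference above does not list the ones the package uses.
ASSUMED_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "mistral": "MISTRAL_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "cohere": "COHERE_API_KEY",
    "huggingface": "HUGGINGFACE_API_KEY",
}


def read_api_key(api="openai"):
    """Return the key for `api` from the environment (hypothetical helper)."""
    try:
        var_name = ASSUMED_ENV_VARS[api]
    except KeyError:
        raise ValueError(f"Unsupported API service: {api!r}") from None
    key = os.environ.get(var_name)
    if not key:
        raise EnvironmentError(f"Environment variable {var_name} is not set")
    return key
```

Reading keys from the environment keeps credentials out of source files and notebooks, which is presumably why the package takes this approach.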
- promptstability.core.load_example_data()
Load example data included with the package.