References: PyCaret Official (Home) | PyCaret Guide | PyCaret GitHub: pycaret/pycaret — An open source, low-code machine learning library in Python
Parameter | Type / Default | Description |
---|---|---|
data | {array-like, sparse matrix} | Shape (n_samples, n_features), where n_samples is the number of samples and n_features is the number of features. |
target | string | Name of the target column, passed as a string. The target variable can be binary or multiclass. For multiclass targets, all estimators are wrapped in a OneVsRest classifier. |
train_size | float, default = 0.7 | Size of the training set. By default, 70% of the data is used for training and validation, and the remaining data is used for the test / hold-out set. |
sampling | bool, default = True | When the sample size exceeds 25,000 samples, pycaret builds a base estimator at various sample sizes from the original dataset. It returns performance plots of AUC, Accuracy, Recall, Precision, Kappa and F1 at the different sample levels to help you determine a suitable sample size for modeling. You are then asked to enter the desired sample size for training and validation in the pycaret environment. If the entered sample_size is less than 1, the remaining dataset (1 - sample) is used to fit the model only when finalize_model() is called. |
sample_estimator | object, default = None | If None, Logistic Regression is used by default. |
categorical_features | string, default = None | If the inferred data types are not correct, categorical_features can be used to overwrite them. If, when running setup, the type of 'column1' is inferred as numeric instead of categorical, this parameter can be used to overwrite the type by passing categorical_features = ['column1']. |
categorical_imputation | string, default = 'constant' | If missing values are found in categorical features, they are imputed with a constant 'not_available' value. The other available option is 'mode', which imputes missing values using the most frequent value in the training dataset. |
ordinal_features | dictionary, default = None | When the data contains ordinal features, they must be encoded differently using the ordinal_features parameter. If the data has a categorical variable with the values 'low', 'medium', 'high' and it is known that low < medium < high, it can be passed as ordinal_features = { 'column_name' : ['low', 'medium', 'high'] }. The list order should be from lowest to highest. |
high_cardinality_features | string, default = None | If the data contains features with high cardinality, they can be compressed into fewer levels by passing them as a list of column names with high cardinality. |
high_cardinality_method | string, default = 'frequency' | When set to 'frequency', the original value of the feature is replaced with its frequency distribution, which quantifies the feature. The other available method is 'clustering', which clusters the statistical attributes of the data and replaces the original value of the feature with a cluster label. |
numeric_features | string, default = None | If the inferred data types are not correct, numeric_features can be used to overwrite them. If, when running setup, the type of 'column1' is inferred as categorical instead of numeric, this parameter can be used to overwrite the type by passing numeric_features = ['column1']. |
numeric_imputation | string, default = 'mean' | If missing values are found in numeric features, they are imputed with the mean value of the feature. The other available option is 'median', which imputes the value using the median of the training dataset. |
date_features | string, default = None | If the data has a DateTime column that is not automatically detected when running setup, this parameter can be used by passing date_features = 'date_column_name'. It can work with multiple date columns. Date columns are not used in modeling; instead, feature extraction is performed and the date columns are dropped from the dataset. If the date column includes a timestamp, time-related features are also extracted. |
ignore_features | string, default = None | If any features should be ignored for modeling, they can be passed to the ignore_features param. The ID and DateTime columns, when inferred, are automatically set to be ignored for modeling purposes. |
normalize | bool, default = False | When set to True, the feature space is transformed using the method defined in the normalize_method parameter. In general, linear algorithms perform better with normalized data, but the results may vary. |
normalize_method | string, default = 'zscore' | Defines the method used for normalization. By default, it is set to 'zscore'. The standard zscore is calculated as z = (x - u) / s. The other available options are listed below. |
'minmax' | | Scales and translates each feature individually so that it is in the range of 0 to 1. |
'maxabs' | | Scales and translates each feature individually so that its maximum absolute value is 1.0. It does not shift/center the data, so it does not destroy any sparsity. |
'robust' | | Scales and translates each feature according to the interquartile range. A robust scaler often gives better results when the dataset contains outliers. |
transformation | bool, default = False | When set to True, a power transformation is applied to make the data more normal / Gaussian-like. This is useful for modeling issues related to heteroscedasticity and other situations where normality is desired. The optimal parameters for stabilizing variance and minimizing skewness are estimated via maximum likelihood. |
transformation_method | string, default = 'yeo-johnson' | Defines the method for transformation. By default, it is set to 'yeo-johnson'. The other available option is the 'quantile' transformation. Both transform the feature set to follow a Gaussian-like or normal distribution. Note that the quantile transformer is non-linear and may distort linear correlations between variables measured on the same scale. |
handle_unknown_categorical | bool, default = True | When set to True, unknown categorical levels in new / unseen data are replaced with the most or least frequent level as learned from the training data. The method is defined by the unknown_categorical_method parameter. |
unknown_categorical_method | string, default = 'least_frequent' | Method used to replace unknown categorical levels in unseen data. It can be set to 'least_frequent' or 'most_frequent'. |
pca | bool, default = False | When set to True, dimensionality reduction is applied to project the data into a lower dimensional space using the method defined in the pca_method parameter. In supervised learning, pca is generally performed when dealing with high feature spaces or when memory is a constraint. Note that not all datasets can be decomposed efficiently using a linear PCA technique, and applying PCA may result in loss of information. It is therefore recommended to run multiple experiments with different pca_method values to evaluate the impact. |
pca_method | string, default = 'linear' | The 'linear' method performs linear dimensionality reduction using Singular Value Decomposition. The other available options are listed below. |
'kernel' | | Dimensionality reduction using the RBF kernel. |
'incremental' | | Replaces 'linear' pca when the dataset to be decomposed is too large to fit in memory. |
pca_components | int/float, default = 0.99 | If pca_components is a float, it is treated as the target percentage of information to retain. If pca_components is an integer, it is treated as the number of features to retain. pca_components must be strictly less than the number of original features in the dataset. |
ignore_low_variance | bool, default = False | When set to True, all categorical features with statistically insignificant variance are removed from the dataset. The variance is calculated using the ratio of unique values to the number of samples, and the ratio of the most common value to the frequency of the second most common value. |
combine_rare_levels | bool, default = False | When set to True, all levels in categorical features below the threshold defined in the rare_level_threshold param are combined into a single level. There must be at least two levels under the threshold for this to take effect. rare_level_threshold represents the percentile distribution of level frequency. Generally, this technique is applied to limit the sparse matrix caused by a large number of levels in categorical features. |
rare_level_threshold | float, default = 0.1 | Percentile distribution below which rare categories are combined. Only takes effect when combine_rare_levels is set to True. |
bin_numeric_features | list, default = None | When a list of numeric features is passed, they are transformed into categorical features using KMeans, where the number of clusters is determined based on the 'sturges' rule. This is only optimal for Gaussian data and underestimates the number of bins for large non-Gaussian datasets. |
remove_outliers | bool, default = False | When set to True, outliers are removed from the training data using PCA linear dimensionality reduction with the Singular Value Decomposition technique. |
outliers_threshold | float, default = 0.05 | The percentage / proportion of outliers in the dataset can be defined using the outliers_threshold parameter. By default, 0.05 is used, which means that 0.025 of the values on each side of the distribution's tails are dropped from the training data. |
remove_multicollinearity | bool, default = False | When set to True, variables with inter-correlations higher than the threshold defined by the multicollinearity_threshold parameter are dropped. When two features are highly correlated with each other, the feature that is less correlated with the target variable is dropped. |
multicollinearity_threshold | float, default = 0.9 | Threshold used for dropping correlated features. Only takes effect when remove_multicollinearity is set to True. |
remove_perfect_collinearity | bool, default = False | When set to True, perfectly collinear features (correlation = 1) are removed from the dataset; when two features are 100% correlated, one of them is dropped from the dataset at random. |
create_clusters | bool, default = False | When set to True, an additional feature is created in which each instance is assigned to a cluster. The number of clusters is determined using a combination of the Calinski-Harabasz and Silhouette criteria. |
cluster_iter | int, default = 20 | Number of iterations used to create a cluster. Each iteration represents a cluster size. Only takes effect when the create_clusters param is set to True. |
polynomial_features | bool, default = False | When set to True, new features are created based on all polynomial combinations that exist within the numeric features of the dataset, up to the degree defined in the polynomial_degree param. |
polynomial_degree | int, default = 2 | Degree of polynomial features. For example, if an input sample is two dimensional and of the form [a, b], the polynomial features with degree = 2 are: [1, a, b, a^2, ab, b^2]. |
trigonometry_features | bool, default = False | When set to True, new features are created based on all trigonometric combinations that exist within the numeric features of the dataset, up to the degree defined in the polynomial_degree param. |
polynomial_threshold | float, default = 0.1 | Polynomial and trigonometric features whose feature importance, based on a combination of Random Forest, AdaBoost and Linear correlation, falls within the defined threshold percentile are kept in the dataset. The remaining features are dropped before further processing. |
group_features | list or list of list, default = None | When the dataset contains features that are related to each other, the group_features param can be used for statistical feature extraction. For example, if the dataset has numeric features that are related to each other ('Col1', 'Col2', 'Col3'), a list containing those column names can be passed under group_features to extract statistical information such as mean, median, mode and standard deviation. |
group_names | list, default = None | When group_features is passed, the group names can be passed into the group_names param as a list of strings. The length of the group_names list must be equal to the length of group_features. If the lengths do not match or names are not passed, new features are named sequentially, such as group_1, group_2, etc. |
feature_selection | bool, default = False | When set to True, a subset of features is selected using a combination of various permutation importance techniques, including Random Forest, Adaboost and Linear correlation with the target variable. The size of the subset depends on the feature_selection_threshold param. This is generally used to constrain the feature space in order to improve modeling efficiency. When polynomial_features and feature_interaction are used, it is highly recommended to define the feature_selection_threshold param with a lower value. |
feature_selection_threshold | float, default = 0.8 | Threshold used for feature selection (including newly created polynomial features). A higher value results in a larger feature space. It is recommended to run multiple trials with different values of feature_selection_threshold, especially when polynomial features and feature interactions are used. Setting a very low value may be efficient but can result in underfitting. |
feature_interaction | bool, default = False | When set to True, new features are created by interacting (a * b) all numeric variables in the dataset, including the polynomial and trigonometric features (if created). This feature is not scalable and may not work as expected on datasets with a large feature space. |
feature_ratio | bool, default = False | When set to True, new features are created by calculating the ratios (a / b) of all numeric variables in the dataset. This feature is not scalable and may not work as expected on datasets with a large feature space. |
interaction_threshold | bool, default = 0.01 | Similar to polynomial_threshold, it is used to compress the sparse matrix of newly created features through interaction. Features whose importance, based on a combination of Random Forest, AdaBoost and Linear correlation, falls within the defined threshold percentile are kept in the dataset. The remaining features are dropped before further processing. |
fix_imbalance | bool, default = False | When the dataset has an unequal distribution of target classes, it can be fixed using the fix_imbalance parameter. When set to True, SMOTE (Synthetic Minority Over-sampling Technique) is applied by default to create synthetic data points for the minority class. |
fix_imbalance_method | obj, default = None | When fix_imbalance is set to True and fix_imbalance_method is None, 'smote' is applied by default to oversample the minority class during cross-validation. This parameter accepts any module from 'imblearn' that supports the 'fit_resample' method. |
data_split_shuffle | bool, default = True | Set to False to prevent rows from being shuffled when splitting the data. |
folds_shuffle | bool, default = False | Set to False to prevent rows from being shuffled when using cross-validation. |
n_jobs | int, default = -1 | Specifies the number of jobs to run in parallel (for functions that support parallel processing). -1 means use all processors. To run all functions on a single processor, set n_jobs to None. |
html | bool, default = True | Set to False to disable the run-time display of the monitor. If you are using an environment that does not support HTML, you need to set it to False. |
session_id | int, default = None | If None, a random seed is generated and returned in the information grid. The unique number is then distributed as a seed to all functions used during the experiment. This can be used for reproducibility of the entire experiment at a later time. |
log_experiment | bool, default = False | When set to True, all metrics and parameters are recorded on the MLFlow server. |
experiment_name | str, default = None | Name of the experiment for logging. When set to None, 'clf' is used by default as the alias for the experiment name. |
log_plots | bool, default = False | When set to True, records a particular plot as a png file in MLflow. The default is set to False. |
log_profile | bool, default = False | If set to True, the data profile will also be recorded in MLflow as an html file. The default is set to False. |
log_data | bool, default = False | When set to True, training and test data will be recorded as csv. |
silent | bool, default = False | If set to True, no data type confirmation is required. All preprocessing is performed assuming an automatically inferred data type. Direct use outside of established pipelines is not recommended. |
verbose | Boolean, default = True | If verbose is set to False, the information grid will not be printed. |
profile | bool, default = False | When set to true, the data profile for exploratory data analysis is displayed in an interactive HTML report. |
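The table above covers the setup() parameters for the classification module; the table below covers the regression module. To show how a few of these parameters fit together in practice, here is a minimal sketch of a classification setup() call. The 'juice' sample dataset, its 'Purchase' target column, and the chosen parameter values are illustrative assumptions rather than anything prescribed by the table.

```python
# Minimal sketch of a classification setup() call (PyCaret 2.x API assumed).
# The 'juice' dataset, its 'Purchase' target, and the parameter values are
# illustrative assumptions only.
from pycaret.datasets import get_data
from pycaret.classification import setup, compare_models

data = get_data('juice')  # sample dataset shipped with pycaret

clf_env = setup(
    data=data,
    target='Purchase',               # target column, binary or multiclass
    train_size=0.7,                  # 70% for training/validation, 30% hold-out
    normalize=True,                  # scale numeric features
    normalize_method='zscore',       # z = (x - u) / s
    remove_multicollinearity=True,   # drop one of each highly correlated pair
    multicollinearity_threshold=0.9,
    fix_imbalance=True,              # apply SMOTE to the minority class
    session_id=123,                  # fixed seed for reproducibility
    silent=True                      # skip interactive data-type confirmation
)

best = compare_models()  # train and rank the available classifiers
```

compare_models() then trains and ranks candidate classifiers inside the environment created by setup(), so the preprocessing choices above are applied consistently across every model.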
Parameter | Type / Default | Description |
---|---|---|
data | {array-like, sparse matrix} | Shape (n_samples, n_features), where n_samples is the number of samples and n_features is the number of features. |
target | string | Name of the target column, passed as a string. |
train_size | float, default = 0.7 | Size of the training set. By default, 70% of the data is used for training and validation, and the remaining data is used for the test / hold-out set. |
sampling | bool, default = True | When the sample size exceeds 25,000 samples, pycaret builds a base estimator at various sample sizes from the original dataset. This returns a performance plot of R2 values at the different sample levels to help you determine a suitable sample size for modeling. You are then asked to enter the desired sample size for training and validation in the pycaret environment. If the entered sample_size is less than 1, the remaining dataset (1 - sample) is used to fit the model only when finalize_model() is called. |
sample_estimator | object, default = None | If None, Linear Regression is used by default. |
categorical_features | string, default = None | If the inferred data types are not correct, categorical_features can be used to overwrite them. If, when running setup, the type of 'column1' is inferred as numeric instead of categorical, this parameter can be used to overwrite the type by passing categorical_features = ['column1']. |
categorical_imputation | string, default = 'constant' | If missing values are found in categorical features, they are imputed with a constant 'not_available' value. The other available option is 'mode', which imputes missing values using the most frequent value in the training dataset. |
ordinal_features | dictionary, default = None | When the data contains ordinal features, they must be encoded differently using the ordinal_features parameter. If the data has a categorical variable with the values 'low', 'medium', 'high' and it is known that low < medium < high, it can be passed as ordinal_features = { 'column_name' : ['low', 'medium', 'high'] }. The list order should be from lowest to highest. |
high_cardinality_features | string, default = None | If the data contains features with high cardinality, they can be compressed into fewer levels by passing them as a list of high cardinality column names. Feature compression is performed with the method defined in the high_cardinality_method param. |
high_cardinality_method | string, default = 'frequency' | When set to 'frequency', the original value of the feature is replaced with its frequency distribution, which quantifies the feature. The other available method is 'clustering', which clusters the statistical attributes of the data and replaces the original value of the feature with a cluster label. |
numeric_features | string, default = None | If the inferred data types are not correct, numeric_features can be used to overwrite them. If, when running setup, the type of 'column1' is inferred as categorical instead of numeric, this parameter can be used to overwrite the type by passing numeric_features = ['column1']. |
numeric_imputation | string, default = 'mean' | If missing values are found in numeric features, they are imputed with the mean value of the feature. The other available option is 'median', which imputes the value using the median of the training dataset. |
date_features | string, default = None | If the data has a DateTime column that is not automatically detected when running setup, this parameter can be used by passing date_features = 'date_column_name'. It can work with multiple date columns. Date columns are not used in modeling; instead, feature extraction is performed and the date columns are dropped from the dataset. If the date column includes a timestamp, time-related features are also extracted. |
ignore_features | string, default = None | If any features should be ignored for modeling, they can be passed to the ignore_features param. The ID and DateTime columns, when inferred, are automatically set to be ignored for modeling purposes. |
normalize | bool, default = False | When set to True, the feature space is transformed using the method defined in the normalize_method parameter. In general, linear algorithms perform better with normalized data, but the results may vary. |
normalize_method | string, default = 'zscore' | Defines the method used for normalization. By default, it is set to 'zscore'. The standard zscore is calculated as z = (x - u) / s. The other available options are listed below. |
'minmax' | | Scales and translates each feature individually so that it is in the range of 0 to 1. |
'maxabs' | | Scales and translates each feature individually so that its maximum absolute value is 1.0. It does not shift/center the data, so it does not destroy any sparsity. |
'robust' | | Scales and translates each feature according to the interquartile range. A robust scaler often gives better results when the dataset contains outliers. |
transformation | bool, default = False | When set to True, a power transformation is applied to make the data more normal / Gaussian-like. This is useful for modeling issues related to heteroscedasticity and other situations where normality is desired. The optimal parameters for stabilizing variance and minimizing skewness are estimated via maximum likelihood. |
transformation_method | string, default = 'yeo-johnson' | Defines the method for transformation. By default, it is set to 'yeo-johnson'. The other available option is the 'quantile' transformation. Both transform the feature set to follow a Gaussian-like or normal distribution. Note that the quantile transformer is non-linear and may distort linear correlations between variables measured on the same scale. |
handle_unknown_categorical | bool, default = True | When set to True, unknown categorical levels in new / unseen data are replaced with the most or least frequent level as learned from the training data. The method is defined by the unknown_categorical_method parameter. |
unknown_categorical_method | string, default = 'least_frequent' | Method used to replace unknown categorical levels in unseen data. It can be set to 'least_frequent' or 'most_frequent'. |
pca | bool, default = False | When set to True, dimensionality reduction is applied to project the data into a lower dimensional space using the method defined in the pca_method parameter. In supervised learning, pca is generally performed when dealing with high feature spaces or when memory is a constraint. Note that not all datasets can be decomposed efficiently using a linear PCA technique, and applying PCA may result in loss of information. It is therefore recommended to run multiple experiments with different pca_method values to evaluate the impact. |
pca_method | string, default = 'linear' | The 'linear' method performs linear dimensionality reduction using Singular Value Decomposition. The other available options are listed below. |
'kernel' | | Dimensionality reduction using the RBF kernel. |
'incremental' | | Replaces 'linear' pca when the dataset to be decomposed is too large to fit in memory. |
pca_components | int/float, default = 0.99 | If pca_components is a float, it is treated as the target percentage of information to retain. If pca_components is an integer, it is treated as the number of features to retain. pca_components must be strictly less than the number of original features in the dataset. |
ignore_low_variance | bool, default = False | When set to True, all categorical features with statistically insignificant variance are removed from the dataset. The variance is calculated using the ratio of unique values to the number of samples, and the ratio of the most common value to the frequency of the second most common value. |
combine_rare_levels | bool, default = False | When set to True, all levels in categorical features below the threshold defined in the rare_level_threshold param are combined into a single level. There must be at least two levels under the threshold for this to take effect. rare_level_threshold represents the percentile distribution of level frequency. Generally, this technique is applied to limit the sparse matrix caused by a large number of levels in categorical features. |
rare_level_threshold | float, default = 0.1 | Percentile distribution below which rare categories are combined. Only takes effect when combine_rare_levels is set to True. |
bin_numeric_features | list, default = None | When a list of numeric features is passed, they are transformed into categorical features using KMeans, where the number of clusters is determined based on the 'sturges' rule. This is only optimal for Gaussian data and underestimates the number of bins for large non-Gaussian datasets. |
remove_outliers | bool, default = False | When set to True, outliers are removed from the training data using PCA linear dimensionality reduction with the Singular Value Decomposition technique. |
outliers_threshold | float, default = 0.05 | The percentage / proportion of outliers in the dataset can be defined using the outliers_threshold parameter. By default, 0.05 is used, which means that 0.025 of the values on each side of the distribution's tails are dropped from the training data. |
remove_multicollinearity | bool, default = False | When set to True, variables with inter-correlations higher than the threshold defined by the multicollinearity_threshold parameter are dropped. When two features are highly correlated with each other, the feature that is less correlated with the target variable is dropped. |
multicollinearity_threshold | float, default = 0.9 | Threshold used for dropping correlated features. Only takes effect when remove_multicollinearity is set to True. |
remove_perfect_collinearity | bool, default = False | When set to True, perfectly collinear features (correlation = 1) are removed from the dataset; when two features are 100% correlated, one of them is dropped from the dataset at random. |
create_clusters | bool, default = False | When set to True, an additional feature is created in which each instance is assigned to a cluster. The number of clusters is determined using a combination of the Calinski-Harabasz and Silhouette criteria. |
cluster_iter | int, default = 20 | Number of iterations used to create a cluster. Each iteration represents a cluster size. Only takes effect when the create_clusters param is set to True. |
polynomial_features | bool, default = False | When set to True, new features are created based on all polynomial combinations that exist within the numeric features of the dataset, up to the degree defined in the polynomial_degree param. |
polynomial_degree | int, default = 2 | Degree of polynomial features. For example, if an input sample is two dimensional and of the form [a, b], the polynomial features with degree = 2 are: [1, a, b, a^2, ab, b^2]. |
trigonometry_features | bool, default = False | When set to True, new features are created based on all trigonometric combinations that exist within the numeric features of the dataset, up to the degree defined in the polynomial_degree param. |
polynomial_threshold | float, default = 0.1 | Used to compress the sparse matrix of polynomial and trigonometric features. Polynomial and trigonometric features whose feature importance, based on a combination of Random Forest, AdaBoost and Linear correlation, falls within the defined threshold percentile are kept in the dataset. The remaining features are dropped before further processing. |
group_features | list or list of list, default = None | When the dataset contains features that are related to each other, the group_features param can be used for statistical feature extraction. For example, if the dataset has numeric features that are related to each other ('Col1', 'Col2', 'Col3'), a list containing those column names can be passed under group_features to extract statistical information such as mean, median, mode and standard deviation. |
group_names | list, default = None | When group_features is passed, the group names can be passed into the group_names param as a list of strings. The length of the group_names list must be equal to the length of group_features. If the lengths do not match or names are not passed, new features are named sequentially, such as group_1, group_2, etc. |
feature_selection | bool, default = False | When set to True, a subset of features is selected using a combination of various permutation importance techniques, including Random Forest, Adaboost and Linear correlation with the target variable. The size of the subset depends on the feature_selection_threshold param. This is generally used to constrain the feature space in order to improve modeling efficiency. When polynomial_features and feature_interaction are used, it is highly recommended to define the feature_selection_threshold param with a lower value. |
feature_selection_threshold | float, default = 0.8 | Threshold used for feature selection (including newly created polynomial features). A higher value results in a larger feature space. It is recommended to run multiple trials with different values of feature_selection_threshold, especially when polynomial features and feature interactions are used. Setting a very low value may be efficient but can result in underfitting. |
feature_interaction | bool, default = False | When set to True, new features are created by interacting (a * b) all numeric variables in the dataset, including the polynomial and trigonometric features (if created). This feature is not scalable and may not work as expected on datasets with a large feature space. |
feature_ratio | bool, default = False | When set to True, new features are created by calculating the ratios (a / b) of all numeric variables in the dataset. This feature is not scalable and may not work as expected on datasets with a large feature space. |
interaction_threshold | bool, default = 0.01 | Similar to polynomial_threshold, it is used to compress the sparse matrix of newly created features through interaction. Features whose importance, based on a combination of Random Forest, AdaBoost and Linear correlation, falls within the defined threshold percentile are kept in the dataset. The remaining features are dropped before further processing. |
transform_target | bool, default = False | When set to True, the target variable is transformed as defined in the transform_target_method param. Target transformation is applied separately from feature transformations. |
transform_target_method | string, default = 'box-cox' | The 'box-cox' and 'yeo-johnson' methods are supported. Box-Cox requires the input data to be strictly positive, while Yeo-Johnson supports both positive and negative data. When transform_target_method is 'box-cox' and the target variable contains negative values, the method is internally forced to 'yeo-johnson' to avoid exceptions. |
data_split_shuffle | bool, default = True | Set to False to prevent rows from being shuffled when splitting the data. |
folds_shuffle | bool, default = True | Set to False to prevent rows from being shuffled when using cross-validation. |
n_jobs | int, default = -1 | Specifies the number of jobs to run in parallel (for functions that support parallel processing). -1 means use all processors. To run all functions on a single processor, set n_jobs to None. |
html | bool, default = True | Set to False to disable the run-time display of the monitor. If you are using an environment that does not support HTML, you need to set it to False. |
session_id | int, default = None | If None, a random seed is generated and returned in the information grid. The unique number is then distributed as a seed to all functions used during the experiment. This can be used for reproducibility of the entire experiment at a later time. |
log_experiment | bool, default = False | When set to True, all metrics and parameters are recorded on the MLFlow server. |
experiment_name | str, default = None | Name of the experiment for logging. When set to None, 'reg' is used by default as the alias for the experiment name. |
log_plots | bool, default = False | When set to True, records a particular plot as a png file in MLflow. The default is set to False. |
log_profile | bool, default = False | If set to True, the data profile will also be recorded in MLflow as an html file. The default is set to False. |
log_data | bool, default = False | When set to True, training and test data will be recorded as csv. |
silent | bool, default = False | If set to True, no data type confirmation is required. All preprocessing is performed assuming an automatically inferred data type. Direct use outside of established pipelines is not recommended. |
verbose | Boolean, default = True | If verbose is set to False, the information grid will not be printed. |
profile | bool, default = False | When set to true, the data profile for exploratory data analysis is displayed in an interactive HTML report. |
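The regression module follows the same pattern, with the regression-specific transform_target and transform_target_method parameters added to the call. Below is a comparable sketch; the 'boston' sample dataset, its 'medv' target column, the 'reg_demo' experiment name, and the parameter choices are assumptions for illustration only.

```python
# Minimal sketch of a regression setup() call (PyCaret 2.x API assumed).
# The 'boston' dataset, its 'medv' target, the experiment name, and the
# parameter values are illustrative assumptions only.
from pycaret.datasets import get_data
from pycaret.regression import setup, compare_models

data = get_data('boston')  # sample regression dataset shipped with pycaret

reg_env = setup(
    data=data,
    target='medv',                      # numeric target column
    train_size=0.7,
    transformation=True,                # power transform skewed features
    transform_target=True,              # transform the target separately
    transform_target_method='box-cox',  # falls back to yeo-johnson if target has negatives
    remove_outliers=True,               # drop tail values via PCA/SVD
    outliers_threshold=0.05,            # ~0.025 trimmed from each tail
    log_experiment=True,                # track metrics and params with MLflow
    experiment_name='reg_demo',         # hypothetical experiment name
    session_id=123                      # fixed seed for reproducibility
)

best = compare_models()  # train and rank the available regressors
```

Using log_experiment=True together with experiment_name mirrors the logging parameters in the table above: each run's metrics, parameters, and (optionally) plots, profiles, and data are recorded to the MLflow tracking server.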