Quantum|Refinement: final-stage refinement with restraints from Quantum Chemistry.

Authors

Min Zheng, Pavel Afonine, Mark Waller, Nigel Moriarty

Purpose

qr.refine is a command line tool for refining bio-macromolecules using restraints from Quantum Chemistry (QM).

Usage

Q|R is a new open-source module that carries out refinement of bio-macromolecules. To maintain a small and agile code-base, qr is built on top of cctbx and TeraChem. The cctbx library provides most of the routines needed for X-ray refinement. The key feature of the qr code is that it interfaces to TeraChem to obtain chemical restraints using ab initio methods.

In principle, qr.refine only needs a data file (e.g. mtz) and a model (e.g. pdb):

qr.refine input.pdb input.mtz

Sensible defaults are used for all other options; see the list of all available keywords below.
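Defaults can be overridden with keywords from the list at the end of this page. The following is a minimal sketch, assuming that qr.refine accepts cctbx PHIL-style name=value arguments after the model and data files, as other cctbx-based command line tools do. The helper build_qr_command is hypothetical and not part of qr; it only assembles the argument list and does not run the program.

```python
import shlex


def build_qr_command(model, data, **keywords):
    """Return an argument list for qr.refine.

    Assumption: keyword overrides are passed as name=value tokens.
    """
    argv = ["qr.refine", model, data]
    for name, value in keywords.items():
        argv.append(f"{name}={value}")
    return argv


# Keyword names are taken from the keyword list on this page.
argv = build_qr_command("input.pdb", "input.mtz",
                        engine_name="mopac", clustering=True)
print(shlex.join(argv))
# prints: qr.refine input.pdb input.mtz engine_name=mopac clustering=True
```

The resulting list could be handed to subprocess.run on a machine where qr.refine is installed.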

QM interface

List of QM interfaces

Literature

https://journals.iucr.org/d/issues/2017/01/00/lp5021/lp5021.pdf

https://journals.iucr.org/d/issues/2017/12/00/lp5024/lp5024.pdf

List of all available keywords

  • max_atoms = 15000 maximum number of atoms
  • debug = False
  • restraints = cctbx *qm
  • output_file_name_prefix = None
  • output_folder_name = "pdb"
  • shared_disk = True
  • rst_file = None Restart file to use for determining location in run. Loads previous results of weight calculations.
  • dump_gradients = None
  • input
    • sequence = None
    • scattering_table = wk1995 it1992 *n_gaussian neutron electron
    • wavelength = None
    • energy = None
    • twin_law = Auto Enter twin law if known.
    • xray_data Scope of X-ray data and free-R flags
      • file_name = None
      • labels = None
      • high_resolution = None
      • low_resolution = None
      • outliers_rejection = True Remove basic Wilson outliers, extreme Wilson outliers, and beamstop shadow outliers
      • french_wilson_scale = True
      • sigma_fobs_rejection_criterion = None
      • sigma_iobs_rejection_criterion = None
      • ignore_all_zeros = True
      • force_anomalous_flag_to_be_equal_to = None
      • convert_to_non_anomalous_if_ratio_pairs_lone_less_than_threshold = 0.5
      • french_wilson
        • max_bins = 60 Maximum number of resolution bins
        • min_bin_size = 40 Minimum number of reflections per bin
      • r_free_flags
        • file_name = None This is normally the same as the file containing Fobs and is usually selected automatically.
        • label = None
        • test_flag_value = None This value is usually selected automatically - do not change unless you really know what you're doing!
        • ignore_r_free_flags = False Use all reflections in refinement (work and test)
        • disable_suitability_test = False
        • ignore_pdb_hexdigest = False If True, disables safety check based on MD5 hexdigests stored in PDB files produced by previous runs.
        • generate = False Generate R-free flags (if not available in input files)
        • fraction = 0.1
        • max_free = 2000
        • lattice_symmetry_max_delta = 5
        • use_lattice_symmetry = True
        • use_dataman_shells = False Used to avoid biasing of the test set by certain types of non-crystallographic symmetry.
        • n_shells = 20
    • pdb
      • file_name = None Model file(s) name (PDB)
    • monomers
      • file_name = None Monomer file(s) name (CIF)
    • maps
      • map_file_name = None A CCP4-formatted map
      • d_min = None Resolution of map
      • map_coefficients_file_name = None MTZ file containing map
      • map_coefficients_label = None Data label for complex map coefficients in MTZ file
  • cluster
    • charge_cutoff = 8.0 distance for point charge cutoff
    • clustering = False enable/disable clustering
    • charge_embedding = False point charge embedding
    • two_buffers = False
    • maxnum_residues_in_cluster = 15 maximum number of residues in a cluster
    • clustering_method = gnc *bcc type of clustering algorithm
    • altloc_method = *average subtract
  • quantum
    • engine_name = *mopac ani torchani terachem turbomole pyscf orca gaussian xtb choose the QM program
    • charge = None The formal charge of the entire molecule
    • basis = Auto pre-defined defaults
    • method = Auto Defaults to HF for all but MOPAC (PM7) and xTB (GFN2)
    • memory = None memory for the QM program
    • nproc = None number of parallel processes for the QM program
    • qm_addon = gcp dftd3 gcp-d3 allows additional calculations of the gCP and/or DFT-D3 corrections using their stand-alone programs
    • qm_addon_method = None specifies flags for the qm_addon. See manual for details.
  • refine
    • dry_run = False do not perform calculations, only setup steps
    • sf_algorithm = *direct fft
    • refinement_target_name = *ml ls_wunit_k1
    • mode = opt *refine choose between refinement and geometry optimization
    • number_of_macro_cycles = 1
    • number_of_weight_search_cycles = 50
    • number_of_refine_cycles = 5 maximum number of refinement cycles
    • number_of_micro_cycles = 50
    • data_weight = None
    • skip_initial_weight_optimization = False
    • max_iterations = 50
    • line_search = True
    • stpmax = 3
    • gradient_only = False
    • update_all_scales = True
    • refine_sites = True
    • refine_adp = False
    • restraints_weight_scale = 1.0
    • shake_sites = False
    • use_convergence_test = True
    • max_bond_rmsd = 0.03
    • max_r_work_r_free_gap = 5.0
    • r_tolerance = 0.001
    • rmsd_tolerance = 0.01
    • opt_log = False additional output of the L-BFGS optimizer
  • parallel
    • method = *multiprocessing slurm pbs sge lsf threading type of parallel mode. multiprocessing is an efficient way to run several processes on the current computer. The others are queueing protocols, with the exception of threading, which is not a safe choice.
    • nproc = None Number of processes to use
    • qsub_command = None Specific command to use on the queue system
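
In the list above, keywords that offer several alternatives (for example engine_name or clustering_method) mark the default with an asterisk, which is the convention cctbx PHIL uses for choice parameters. The sketch below shows how such an entry can be read. The helper parse_choice is hypothetical, works on plain strings, and does not use the libtbx.phil parser.

```python
def parse_choice(entry):
    """Split 'name = a *b c' into (name, options, default).

    The default is the option prefixed with '*', or None if no
    option is starred.
    """
    name, _, rhs = entry.partition("=")
    options, default = [], None
    for token in rhs.split():
        if token.startswith("*"):
            token = token[1:]
            default = token
        options.append(token)
    return name.strip(), options, default


name, options, default = parse_choice("clustering_method = gnc *bcc")
print(name, options, default)
# prints: clustering_method ['gnc', 'bcc'] bcc
```

An entry without an asterisk, such as qm_addon = gcp dftd3 gcp-d3, yields a default of None in this sketch.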