Difference between revisions of "Summary class Geromics 2024 HyoungJinChoi"

Revision as of 11:12, 10 May 2024

2024.03.06

Geromics orientation

2024.03.08

What is theory?

A theory is a rational type of abstract thinking about a phenomenon, or the results of such thinking. The process of contemplative and rational thinking is often associated with such processes as observational study or research. Theories may be scientific, belong to a non-scientific discipline, or no discipline at all. Depending on the context, a theory's assertions might, for example, include generalized explanations of how nature works. The word has its roots in ancient Greek, but in modern use it has taken on several related meanings.

In modern science, the term "theory" refers to scientific theories, a well-confirmed type of explanation of nature, made in a way consistent with the scientific method, and fulfilling the criteria required by modern science. Such theories are described in such a way that scientific tests should be able to provide empirical support for it, or empirical contradiction ("falsify") of it. Scientific theories are the most reliable, rigorous, and comprehensive form of scientific knowledge,^[1] in contrast to more common uses of the word "theory" that imply that something is unproven or speculative (which in formal terms is better characterized by the word hypothesis).^[2] Scientific theories are distinguished from hypotheses, which are individual empirically testable conjectures, and from scientific laws, which are descriptive accounts of the way nature behaves under certain conditions.

Theories guide the enterprise of finding facts rather than of reaching goals, and are neutral concerning alternatives among values.^[3]^: 131 A theory can be a body of knowledge, which may or may not be associated with particular explanatory models. To theorize is to develop this body of knowledge.^[4]^: 46

The word theory or "in theory" is sometimes used outside of science to refer to something which the speaker did not experience or test before.^[5] In science, this same concept is referred to as a hypothesis, and the word "hypothetically" is used both inside and outside of science. In its usage outside of science, the word "theory" is very often contrasted to "practice" (from Greek praxis, πρᾶξις) a Greek term for doing, which is opposed to theory.^[6] A "classical example" of the distinction between "theoretical" and "practical" uses the discipline of medicine: medical theory involves trying to understand the causes and nature of health and sickness, while the practical side of medicine is trying to make people healthy. These two things are related but can be independent, because it is possible to research health and sickness without curing specific patients, and it is possible to cure a patient without knowing how the cure worked.^[a]

full text link : https://en.wikipedia.org/wiki/Theory

2024.03.22

--
Class preparation: Before you attend this week's lecture, I would like to encourage you to watch the following YouTube video:

Title: Mitochondrial Regulation of Stem Cell Aging
Presenter: Danica Chen, PhD (University of California, Berkeley, USA)
YouTube Link: https://www.youtube.com/watch?v=FoJWmaT1ptM

In this video, Professor Danica Chen discusses various methods to protect mitochondria and reverse stem cell aging through sirtuins.
It's an insightful presentation that will undoubtedly enrich our understanding of the topic before our lecture.
--

Mitochondrial Stress is a Driver of Stem Cell Aging

Mitochondrial stress increases in stem cells during aging.
Mitochondrial dysfunction and aging produce similar defects in stem cells.
Stem cells do not age at the same rate; about one third of chronologically aged HSCs exhibit regenerative function similar to healthy young HSCs, coinciding with the health of mitochondria.

Stem cell

In multicellular organisms, stem cells are undifferentiated or partially differentiated cells that can change into various types of cells and proliferate indefinitely to produce more of the same stem cell. They are the earliest type of cell in a cell lineage.^[1] They are found in both embryonic and adult organisms, but they have slightly different properties in each. They are usually distinguished from progenitor cells, which cannot divide indefinitely, and precursor or blast cells, which are usually committed to differentiating into one cell type.

In mammals, roughly 50 to 150 cells make up the inner cell mass during the blastocyst stage of embryonic development, around days 5–14. These have stem-cell capability. In vivo, they eventually differentiate into all of the body's cell types (making them pluripotent). This process starts with the differentiation into the three germ layers – the ectoderm, mesoderm and endoderm – at the gastrulation stage. However, when they are isolated and cultured in vitro, they can be kept in the stem-cell stage and are known as embryonic stem cells (ESCs).

Adult stem cells are found in a few select locations in the body, known as niches, such as those in the bone marrow or gonads. They exist to replenish rapidly lost cell types and are multipotent or unipotent, meaning they only differentiate into a few cell types or one type of cell. In mammals, they include, among others, hematopoietic stem cells, which replenish blood and immune cells, basal cells, which maintain the skin epithelium, and mesenchymal stem cells, which maintain bone, cartilage, muscle and fat cells. Adult stem cells are a small minority of cells; they are vastly outnumbered by the progenitor cells and terminally differentiated cells that they differentiate into.^[1]

Research into stem cells grew out of findings by Canadian biologists Ernest McCulloch, James Till and Andrew J. Becker at the University of Toronto and the Ontario Cancer Institute in the 1960s.^[2]^[3] As of 2016, the only established medical therapy using stem cells is hematopoietic stem cell transplantation,^[4] first performed in 1958 by French oncologist Georges Mathé. Since 1998 however, it has been possible to culture and differentiate human embryonic stem cells (in stem-cell lines). The process of isolating these cells has been controversial, because it typically results in the destruction of the embryo. Sources for isolating ESCs have been restricted in some European countries and Canada, but others such as the UK and China have promoted the research.^[5] Somatic cell nuclear transfer is a cloning method that can be used to create a cloned embryo for the use of its embryonic stem cells in stem cell therapy.^[6] In 2006, a Japanese team led by Shinya Yamanaka discovered a method to convert mature body cells back into stem cells. These were termed induced pluripotent stem cells (iPSCs).^[7]

full text link : https://en.wikipedia.org/wiki/Stem_cell

How does the total amount of stem cells in humans change over time?
At what age does it reach its maximum and minimum?
When fertilization occurs, one: start life, two 120 years.

When certain data points are plotted, it seems feasible to converge through statistical methods (considering the number of inflection points).
Are there any papers related to the number of stem cells at various ages in a particular sample?
> No results found in the initial search. (Only 14 minutes were spent searching.)

2024.03.29

Occupations with high life expectancy?

full text link : https://www.hani.co.kr/arti/society/rights/471412.html

2024.04.05

DNA

Deoxyribonucleic acid (/diːˈɒksɪˌraɪboʊnjuːˌkliːɪk, -ˌkleɪ-/;^[1] DNA) is a polymer composed of two polynucleotide chains that coil around each other to form a double helix. The polymer carries genetic instructions for the development, functioning, growth and reproduction of all known organisms and many viruses. DNA and ribonucleic acid (RNA) are nucleic acids. Alongside proteins, lipids and complex carbohydrates (polysaccharides), nucleic acids are one of the four major types of macromolecules that are essential for all known forms of life.

The two DNA strands are known as polynucleotides as they are composed of simpler monomeric units called nucleotides.^[2]^[3] Each nucleotide is composed of one of four nitrogen-containing nucleobases (cytosine [C], guanine [G], adenine [A] or thymine [T]), a sugar called deoxyribose, and a phosphate group. The nucleotides are joined to one another in a chain by covalent bonds (known as the phosphodiester linkage) between the sugar of one nucleotide and the phosphate of the next, resulting in an alternating sugar-phosphate backbone. The nitrogenous bases of the two separate polynucleotide strands are bound together, according to base pairing rules (A with T and C with G), with hydrogen bonds to make double-stranded DNA. The complementary nitrogenous bases are divided into two groups, the single-ringed pyrimidines and the double-ringed purines. In DNA, the pyrimidines are thymine and cytosine; the purines are adenine and guanine.

Both strands of double-stranded DNA store the same biological information. This information is replicated when the two strands separate. A large part of DNA (more than 98% for humans) is non-coding, meaning that these sections do not serve as patterns for protein sequences. The two strands of DNA run in opposite directions to each other and are thus antiparallel. Attached to each sugar is one of four types of nucleobases (or bases). It is the sequence of these four nucleobases along the backbone that encodes genetic information. RNA strands are created using DNA strands as a template in a process called transcription, where DNA bases are exchanged for their corresponding bases except in the case of thymine (T), for which RNA substitutes uracil (U).^[4] Under the genetic code, these RNA strands specify the sequence of amino acids within proteins in a process called translation.
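
As a minimal sketch of the base-pairing and transcription rules described above, the following Python snippet builds the antiparallel partner strand and an RNA copy of a short sequence. The function names and the example sequence are illustrative assumptions, not part of the lecture material.

```python
# Base-pairing rules from the text: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(dna: str) -> str:
    """Return the antiparallel partner strand, written in the same reading direction."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def transcribe(coding_strand: str) -> str:
    """Return the RNA that matches the coding strand, with uracil (U) in place of thymine (T)."""
    return coding_strand.upper().replace("T", "U")


if __name__ == "__main__":
    seq = "ATGCGT"  # hypothetical example sequence
    print(reverse_complement(seq))  # ACGCAT
    print(transcribe(seq))          # AUGCGU
```

Reversing the strand before complementing reflects the antiparallel orientation of the two strands.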

Within eukaryotic cells, DNA is organized into long structures called chromosomes. Before typical cell division, these chromosomes are duplicated in the process of DNA replication, providing a complete set of chromosomes for each daughter cell. Eukaryotic organisms (animals, plants, fungi and protists) store most of their DNA inside the cell nucleus as nuclear DNA, and some in the mitochondria as mitochondrial DNA or in chloroplasts as chloroplast DNA.^[5] In contrast, prokaryotes (bacteria and archaea) store their DNA only in the cytoplasm, in circular chromosomes. Within eukaryotic chromosomes, chromatin proteins, such as histones, compact and organize DNA. These compacting structures guide the interactions between DNA and other proteins, helping control which parts of the DNA are transcribed.

full text link : https://en.wikipedia.org/wiki/DNA

RNA

Ribonucleic acid (RNA) is a polymeric molecule that is essential for most biological functions, either by performing the function itself (non-coding RNA) or by forming a template for the production of proteins (messenger RNA). RNA and deoxyribonucleic acid (DNA) are nucleic acids. The nucleic acids constitute one of the four major macromolecules essential for all known forms of life. RNA is assembled as a chain of nucleotides. Cellular organisms use messenger RNA (mRNA) to convey genetic information (using the nitrogenous bases of guanine, uracil, adenine, and cytosine, denoted by the letters G, U, A, and C) that directs synthesis of specific proteins. Many viruses encode their genetic information using an RNA genome.

Some RNA molecules play an active role within cells by catalyzing biological reactions, controlling gene expression, or sensing and communicating responses to cellular signals. One of these active processes is protein synthesis, a universal function in which RNA molecules direct the synthesis of proteins on ribosomes. This process uses transfer RNA (tRNA) molecules to deliver amino acids to the ribosome, where ribosomal RNA (rRNA) then links amino acids together to form coded proteins.

It has become widely accepted in science^[1] that early in the history of life on Earth, prior to the evolution of DNA and possibly of protein-based enzymes as well, an "RNA world" existed in which RNA served as both living organisms' storage method for genetic information—a role fulfilled today by DNA, except in the case of RNA viruses—and potentially performed catalytic functions in cells—a function performed today by protein enzymes, with the notable and important exception of the ribosome, which is a ribozyme.

Full text link : https://en.wikipedia.org/wiki/RNA

eQTL

Distant and local, trans- and cis-eQTLs, respectively

An expression quantitative trait is an amount of an mRNA transcript or a protein. These are usually the product of a single gene with a specific chromosomal location. This distinguishes expression quantitative traits from most complex traits, which are not the product of the expression of a single gene. Chromosomal loci that explain variance in expression traits are called eQTLs. eQTLs located near the gene-of-origin (gene which produces the transcript or protein) are referred to as local eQTLs or cis-eQTLs. By contrast, those located distant from their gene of origin, often on different chromosomes, are referred to as distant eQTLs or trans-eQTLs.^[3] ^[4] The first genome-wide study of gene expression was carried out in yeast and published in 2002.^[5] The initial wave of eQTL studies employed microarrays to measure genome-wide gene expression; more recent studies have employed massively parallel RNA sequencing. Many expression QTL studies were performed in plants and animals, including humans,^[6] non-human primates^[7]^[8] and mice.^[9]

Some cis eQTLs are detected in many tissue types but the majority of trans eQTLs are tissue-dependent (dynamic).^[10] eQTLs may act in cis (locally) or trans (at a distance) to a gene.^[11] The abundance of a gene transcript is directly modified by polymorphism in regulatory elements. Consequently, transcript abundance might be considered as a quantitative trait that can be mapped with considerable power. These have been named expression QTLs (eQTLs).^[12] The combination of whole-genome genetic association studies and the measurement of global gene expression allows the systematic identification of eQTLs. By assaying gene expression and genetic variation simultaneously on a genome-wide basis in a large number of individuals, statistical genetic methods can be used to map the genetic factors that underpin individual differences in quantitative levels of expression of many thousands of transcripts.^[13] Studies have shown that single nucleotide polymorphisms (SNPs) reproducibly associated with complex disorders ^[14] as well as certain pharmacologic phenotypes ^[15] are found to be significantly enriched for eQTLs, relative to frequency-matched control SNPs. The integration of eQTLs with GWAS has led to development of the transcriptome-wide association study (TWAS) methodology.^[16]^[17]

Detecting eQTLs

Mapping eQTLs is done using standard QTL mapping methods that test the linkage between variation in expression and genetic polymorphisms. The only considerable difference is that eQTL studies can involve a million or more expression microtraits. Standard gene mapping software packages can be used, although it is often faster to use custom code such as QTL Reaper or the web-based eQTL mapping system GeneNetwork. GeneNetwork hosts many large eQTL mapping data sets and provides access to fast algorithms to map single loci and epistatic interactions. As is true in all QTL mapping studies, the final steps in defining DNA variants that cause variation in traits are usually difficult and require a second round of experimentation. This is especially the case for trans eQTLs that do not benefit from the strong prior probability that relevant variants are in the immediate vicinity of the parent gene. Statistical, graphical, and bioinformatic methods are used to evaluate positional candidate genes and entire systems of interactions.^[18]^[19] The development of single-cell technologies and parallel advances in statistical methods have made it possible to define even subtle changes in eQTLs as cell-states change.^[20]^[21]
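
To make the idea of associating expression with a genetic polymorphism concrete, here is a small sketch for a single variant and a single transcript. It uses an ordinary least-squares slope of expression on allele dosage, which is a simplification chosen for illustration and not the algorithm used by QTL Reaper or GeneNetwork. The function name and the data are made up.

```python
from statistics import mean


def eqtl_slope(genotypes, expression):
    """Least-squares slope of expression on allele dosage (0, 1 or 2 copies)."""
    g_bar, e_bar = mean(genotypes), mean(expression)
    sxy = sum((g - g_bar) * (e - e_bar) for g, e in zip(genotypes, expression))
    sxx = sum((g - g_bar) ** 2 for g in genotypes)
    if sxx == 0:
        raise ValueError("genotype has no variation; slope is undefined")
    return sxy / sxx


if __name__ == "__main__":
    # Made-up data for six individuals: dosage of one allele and
    # the measured expression level of one transcript.
    dosage = [0, 0, 1, 1, 2, 2]
    expr = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
    print(eqtl_slope(dosage, expr))  # close to 1.0 expression units per allele copy
```

A real study would repeat such a test across many variant and transcript pairs, include covariates, and correct for multiple testing.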

Full text link : https://en.wikipedia.org/wiki/Expression_quantitative_trait_loci

2024.04.12

Proteomics

Proteomics is the large-scale study of proteins.^[1]^[2] Proteins are vital parts of living organisms, with many functions such as the formation of structural fibers of muscle tissue, enzymatic digestion of food, or synthesis and replication of DNA. In addition, other kinds of proteins include antibodies that protect an organism from infection, and hormones that send important signals throughout the body.

The proteome is the entire set of proteins produced or modified by an organism or system. Proteomics enables the identification of ever-increasing numbers of proteins. This varies with time and distinct requirements, or stresses, that a cell or organism undergoes.^[3]

Proteomics is an interdisciplinary domain that has benefited greatly from the genetic information of various genome projects, including the Human Genome Project.^[4] It covers the exploration of proteomes from the overall level of protein composition, structure, and activity, and is an important component of functional genomics.

Proteomics generally denotes the large-scale experimental analysis of proteins and proteomes, but often refers specifically to protein purification and mass spectrometry. Indeed, mass spectrometry is the most powerful method for analysis of proteomes, both in large samples composed of millions of cells^[5] and in single cells.^[6]^[7]

Full text link : https://en.wikipedia.org/wiki/Proteomics

Omics

The branches of science known informally as omics are various disciplines in biology whose names end in the suffix -omics, such as genomics, proteomics, metabolomics, metagenomics, phenomics and transcriptomics. Omics aims at the collective characterization and quantification of pools of biological molecules that translate into the structure, function, and dynamics of an organism or organisms.^[1]

The related suffix -ome is used to address the objects of study of such fields, such as the genome, proteome or metabolome respectively. The suffix -ome as used in molecular biology refers to a totality of some sort; it is an example of a "neo-suffix" formed by abstraction from various Greek terms in -ωμα, a sequence that does not form an identifiable suffix in Greek.

Functional genomics aims at identifying the functions of as many genes as possible of a given organism. It combines different -omics techniques such as transcriptomics and proteomics with saturated mutant collections.^[2]

Full text link : https://en.wikipedia.org/wiki/Omics

-ology

An ology or -logy is a scientific discipline.

Protein

Proteins are large biomolecules and macromolecules that comprise one or more long chains of amino acid residues. Proteins perform a vast array of functions within organisms, including catalysing metabolic reactions, DNA replication, responding to stimuli, providing structure to cells and organisms, and transporting molecules from one location to another. Proteins differ from one another primarily in their sequence of amino acids, which is dictated by the nucleotide sequence of their genes, and which usually results in protein folding into a specific 3D structure that determines its activity.

A linear chain of amino acid residues is called a polypeptide. A protein contains at least one long polypeptide. Short polypeptides, containing less than 20–30 residues, are rarely considered to be proteins and are commonly called peptides. The individual amino acid residues are bonded together by peptide bonds and adjacent amino acid residues. The sequence of amino acid residues in a protein is defined by the sequence of a gene, which is encoded in the genetic code. In general, the genetic code specifies 20 standard amino acids; but in certain organisms the genetic code can include selenocysteine and—in certain archaea—pyrrolysine. Shortly after or even during synthesis, the residues in a protein are often chemically modified by post-translational modification, which alters the physical and chemical properties, folding, stability, activity, and ultimately, the function of the proteins. Some proteins have non-peptide groups attached, which can be called prosthetic groups or cofactors. Proteins can also work together to achieve a particular function, and they often associate to form stable protein complexes.

Once formed, proteins only exist for a certain period and are then degraded and recycled by the cell's machinery through the process of protein turnover. A protein's lifespan is measured in terms of its half-life and covers a wide range. They can exist for minutes or years with an average lifespan of 1–2 days in mammalian cells. Abnormal or misfolded proteins are degraded more rapidly either due to being targeted for destruction or due to being unstable.
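
The half-life idea can be illustrated with a short calculation. The sketch below assumes simple first-order (exponential) decay, which the text does not state explicitly, and uses a hypothetical half-life of 24 hours.

```python
def fraction_remaining(elapsed_hours: float, half_life_hours: float) -> float:
    """Fraction of an initial protein pool left, assuming first-order decay."""
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (elapsed_hours / half_life_hours)


if __name__ == "__main__":
    # Hypothetical protein with a 24-hour half-life, observed after 48 hours:
    # two half-lives have passed, so one quarter remains.
    print(fraction_remaining(48, 24))  # 0.25
```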

Like other biological macromolecules such as polysaccharides and nucleic acids, proteins are essential parts of organisms and participate in virtually every process within cells. Many proteins are enzymes that catalyse biochemical reactions and are vital to metabolism. Proteins also have structural or mechanical functions, such as actin and myosin in muscle and the proteins in the cytoskeleton, which form a system of scaffolding that maintains cell shape. Other proteins are important in cell signaling, immune responses, cell adhesion, and the cell cycle. In animals, proteins are needed in the diet to provide the essential amino acids that cannot be synthesized. Digestion breaks the proteins down for metabolic use.

Proteins may be purified from other cellular components using a variety of techniques such as ultracentrifugation, precipitation, electrophoresis, and chromatography; the advent of genetic engineering has made possible a number of methods to facilitate purification. Methods commonly used to study protein structure and function include immunohistochemistry, site-directed mutagenesis, X-ray crystallography, nuclear magnetic resonance and mass spectrometry.

PPI (Protein-Protein interaction)

Protein–protein interactions (PPIs) are physical contacts of high specificity established between two or more protein molecules as a result of biochemical events steered by interactions that include electrostatic forces, hydrogen bonding and the hydrophobic effect. Many are physical contacts with molecular associations between chains that occur in a cell or in a living organism in a specific biomolecular context.

Proteins rarely act alone as their functions tend to be regulated. Many molecular processes within a cell are carried out by molecular machines that are built from numerous protein components organized by their PPIs. These physiological interactions make up the so-called interactomics of the organism, while aberrant PPIs are the basis of multiple aggregation-related diseases, such as Creutzfeldt–Jakob and Alzheimer's diseases.

PPIs have been studied with many methods and from different perspectives: biochemistry, quantum chemistry, molecular dynamics, signal transduction, among others.^[1]^[2]^[3] All this information enables the creation of large protein interaction networks^[4] – similar to metabolic or genetic/epigenetic networks – that empower the current knowledge on biochemical cascades and molecular etiology of disease, as well as the discovery of putative protein targets of therapeutic interest.

full text link : https://en.wikipedia.org/wiki/Protein%E2%80%93protein_interaction

String

in computer science

In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.

Depending on the programming language and precise data type used, a variable declared to be a string may either cause storage in memory to be statically allocated for a predetermined maximum length or employ dynamic allocation to allow it to hold a variable number of elements.

When a string appears literally in source code, it is known as a string literal or an anonymous string.^[1]

In formal languages, which are used in mathematical logic and theoretical computer science, a string is a finite sequence of symbols that are chosen from a set called an alphabet.
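
As a small illustration of the points above (a string as a sequence of characters, stored as bytes under a character encoding, and either mutable or fixed), here is a sketch in Python, where `str` values cannot be changed after creation. The helper name `describe` and the sample text are assumptions made for the example.

```python
def describe(text: str, encoding: str = "utf-8") -> tuple[int, int]:
    """Return (number of characters, number of bytes under the given encoding)."""
    return len(text), len(text.encode(encoding))


if __name__ == "__main__":
    literal = "DNA\u2192RNA"  # a string literal; \u2192 is a right arrow
    print(describe(literal))  # (7, 9): the arrow takes 3 bytes in UTF-8

    try:
        literal[0] = "R"      # Python strings are fixed after creation
    except TypeError:
        print("str is immutable")

    chars = list(literal)     # a mutable sequence of the same characters
    chars[0] = "R"
    print("".join(chars))
```

The character count and the byte count differ as soon as the text contains characters outside ASCII, which is why the encoding matters.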

full text link : https://en.wikipedia.org/wiki/String_(computer_science)

in structure

String is a long flexible structure made from fibers twisted together into a single strand, or from multiple such strands which are in turn twisted together. String is used to tie, bind, or hang other objects. It is also used as a material to make things, such as textiles, and in arts and crafts. String is a simple tool, and its use by humans is known to have been developed tens of thousands of years ago.^[1] In Mesoamerica, for example, string was invented some 20,000 to 30,000 years ago, and was made by twisting plant fibers together.^[1] String may also be a component in other tools, and in devices as diverse as weapons, musical instruments, and toys.

full text link : https://en.wikipedia.org/wiki/String_(structure)

2024.04.19

P-value

In null-hypothesis significance testing, the 𝑝-value^{[note 1]} is the probability of obtaining test results at least as extreme as the result actually observed, under the assumption that the null hypothesis is correct.^[2]^[3] A very small p-value means that such an extreme observed outcome would be very unlikely under the null hypothesis. Even though reporting p-values of statistical tests is common practice in academic publications of many quantitative fields, misinterpretation and misuse of p-values is widespread and has been a major topic in mathematics and metascience.^[4]^[5] In 2016, the American Statistical Association (ASA) made a formal statement that "p-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone" and that "a p-value, or statistical significance, does not measure the size of an effect or the importance of a result" or "evidence regarding a model or hypothesis".^[6] That said, a 2019 task force by ASA has issued a statement on statistical significance and replicability, concluding with: "p-values and significance tests, when properly applied and interpreted, increase the rigor of the conclusions drawn from data".^[7]

In statistics, every conjecture concerning the unknown probability distribution of a collection of random variables representing the observed data 𝑋 in some study is called a statistical hypothesis. If we state one hypothesis only and the aim of the statistical test is to see whether this hypothesis is tenable, but not to investigate other specific hypotheses, then such a test is called a null hypothesis test.

As our statistical hypothesis will, by definition, state some property of the distribution, the null hypothesis is the default hypothesis under which that property does not exist. The null hypothesis is typically that some parameter (such as a correlation or a difference between means) in the populations of interest is zero. Our hypothesis might specify the probability distribution of 𝑋 precisely, or it might only specify that it belongs to some class of distributions. Often, we reduce the data to a single numerical statistic, e.g., 𝑇, whose marginal probability distribution is closely connected to a main question of interest in the study.

The p-value is used in the context of null hypothesis testing in order to quantify the statistical significance of a result, the result being the observed value of the chosen statistic 𝑇.^{[note 2]} The lower the p-value is, the lower the probability of getting that result if the null hypothesis were true. A result is said to be statistically significant if it allows us to reject the null hypothesis. All other things being equal, smaller p-values are taken as stronger evidence against the null hypothesis.

Loosely speaking, rejection of the null hypothesis implies that there is sufficient evidence against it.

As a particular example, if a null hypothesis states that a certain summary statistic 𝑇 follows the standard normal distribution 𝑁(0,1), then the rejection of this null hypothesis could mean that (i) the mean of 𝑇 is not 0, or (ii) the variance of 𝑇 is not 1, or (iii) 𝑇 is not normally distributed. Different tests of the same null hypothesis would be more or less sensitive to different alternatives. However, even if we do manage to reject the null hypothesis for all 3 alternatives, and even if we know that the distribution is normal and variance is 1, the null hypothesis test does not tell us which non-zero values of the mean are now most plausible. The more independent observations from the same probability distribution one has, the more accurate the test will be, and the higher the precision with which one will be able to determine the mean value and show that it is not equal to zero; but this will also increase the importance of evaluating the real-world or scientific relevance of this deviation.
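
For the example above, where the statistic 𝑇 follows 𝑁(0,1) under the null hypothesis, a two-sided p-value can be computed directly. This is a minimal sketch using only the Python standard library. The function name and the choice of a two-sided test are assumptions made for illustration.

```python
import math


def two_sided_p_value(t_obs: float) -> float:
    """P(|T| >= |t_obs|) when T follows N(0, 1) under the null hypothesis."""
    # For a standard normal T, P(|T| >= a) equals erfc(a / sqrt(2)).
    return math.erfc(abs(t_obs) / math.sqrt(2))


if __name__ == "__main__":
    print(two_sided_p_value(1.96))  # roughly 0.05
    print(two_sided_p_value(0.0))   # 1.0: every outcome is at least this extreme
```

A smaller value means the observed statistic would be more unusual if the null hypothesis were true. It does not give the probability that the null hypothesis itself is true.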

full text link : https://en.wikipedia.org/wiki/P-value

Log

In mathematics, the logarithm is the inverse function to exponentiation. That means that the logarithm of a number x to the base b is the exponent to which b must be raised to produce x. For example, since 1000 = 10³, the logarithm base 10 of 1000 is 3, or log₁₀ (1000) = 3. The logarithm of x to base b is denoted as log_b (x), or without parentheses, log_b x. When the base is clear from the context or is irrelevant, such as in big O notation, it is sometimes written log x.

The logarithm base 10 is called the decimal or common logarithm and is commonly used in science and engineering. The natural logarithm has the number e ≈ 2.718 as its base; its use is widespread in mathematics and physics, because of its very simple derivative. The binary logarithm uses base 2 and is frequently used in computer science.

Logarithms were introduced by John Napier in 1614 as a means of simplifying calculations.^[1] They were rapidly adopted by navigators, scientists, engineers, surveyors, and others to perform high-accuracy computations more easily. Using logarithm tables, tedious multi-digit multiplication steps can be replaced by table look-ups and simpler addition. This is possible because the logarithm of a product is the sum of the logarithms of the factors: log_b (xy) = log_b x + log_b y,

provided that b, x and y are all positive and b ≠ 1. The slide rule, also based on logarithms, allows quick calculations without tables, but at lower precision. The present-day notion of logarithms comes from Leonhard Euler, who connected them to the exponential function in the 18th century, and who also introduced the letter e as the base of natural logarithms.^[2]
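
The product rule above can be checked numerically. The following sketch compares both sides of the identity with Python's `math.log`. The function name and the sample numbers are arbitrary choices for the example.

```python
import math


def log_product_sides(x: float, y: float, base: float = 10.0) -> tuple[float, float]:
    """Return (log_b(x*y), log_b(x) + log_b(y)) for positive x, y and base b != 1."""
    lhs = math.log(x * y, base)
    rhs = math.log(x, base) + math.log(y, base)
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = log_product_sides(125.0, 8.0)
    # Both sides agree up to floating-point rounding (125 * 8 = 1000, so about 3).
    print(lhs, rhs, math.isclose(lhs, rhs))
```

The comparison uses `math.isclose` because floating-point logarithms are rounded, so the two sides may differ in the last digits.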

Logarithmic scales reduce wide-ranging quantities to smaller scopes. For example, the decibel (dB) is a unit used to express ratio as logarithms, mostly for signal power and amplitude (of which sound pressure is a common example). In chemistry, pH is a logarithmic measure for the acidity of an aqueous solution. Logarithms are commonplace in scientific formulae, and in measurements of the complexity of algorithms and of geometric objects called fractals. They help to describe frequency ratios of musical intervals, appear in formulas counting prime numbers or approximating factorials, inform some models in psychophysics, and can aid in forensic accounting.

The concept of logarithm as the inverse of exponentiation extends to other mathematical structures as well. However, in general settings, the logarithm tends to be a multi-valued function. For example, the complex logarithm is the multi-valued inverse of the complex exponential function. Similarly, the discrete logarithm is the multi-valued inverse of the exponential function in finite groups; it has uses in public-key cryptography.

full text link : https://en.wikipedia.org/wiki/Logarithm

Likelihood

The likelihood function (often simply called the likelihood) is the joint probability mass (or probability density) of observed data viewed as a function of the parameters of a statistical model.^[1]^[2]^[3] Intuitively, the likelihood function 𝐿(𝜃∣𝑥) is the probability of observing data 𝑥 assuming 𝜃 is the actual parameter.

In maximum likelihood estimation, the arg max (over the parameter 𝜃) of the likelihood function serves as a point estimate for 𝜃, while the Fisher information (often approximated by the likelihood's Hessian matrix) indicates the estimate's precision.
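A minimal sketch of maximum likelihood estimation in Python, assuming a coin-flip (binomial) model with made-up data of 7 heads in 10 flips. The grid search and all names below are illustrative choices, not a method prescribed by the source. The precision estimate uses the binomial Fisher information n / (p(1 - p)) evaluated at the estimate.

```python
import math


def binomial_log_likelihood(p, heads, flips):
    """Log-likelihood of heads probability p (constant term dropped)."""
    # The binomial coefficient does not depend on p, so it is omitted.
    return heads * math.log(p) + (flips - heads) * math.log(1.0 - p)


heads, flips = 7, 10  # hypothetical data

# arg max over a grid of candidate parameter values in (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=lambda p: binomial_log_likelihood(p, heads, flips))

# Fisher information of the binomial model at p_hat, and the
# resulting approximate standard error of the estimate.
fisher_info = flips / (p_hat * (1.0 - p_hat))
std_error = 1.0 / math.sqrt(fisher_info)

print(p_hat, std_error)
```

For this model the grid maximum coincides with the closed-form estimate heads / flips. A larger Fisher information means a more sharply peaked likelihood and a smaller standard error.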

In contrast, in Bayesian statistics, parameter estimates are derived from the converse of the likelihood, the so-called posterior probability, which is calculated via Bayes' rule.^[4]

The likelihood function, parameterized by a (possibly multivariate) parameter 𝜃, is usually defined differently for discrete and continuous probability distributions (a more general definition is discussed below). Given a probability density or mass function

𝑥↦𝑓(𝑥∣𝜃),

where 𝑥 is a realization of the random variable 𝑋, the likelihood function is 𝜃↦𝑓(𝑥∣𝜃),

often written
𝐿(𝜃∣𝑥).

In other words, when 𝑓(𝑥∣𝜃) is viewed as a function of 𝑥 with 𝜃 fixed, it is a probability density function, and when viewed as a function of 𝜃 with 𝑥 fixed, it is a likelihood function. In the frequentist paradigm, the notation 𝑓(𝑥∣𝜃) is often avoided and instead 𝑓(𝑥;𝜃) or 𝑓(𝑥,𝜃) are used to indicate that 𝜃 is regarded as a fixed unknown quantity rather than as a random variable being conditioned on.

The likelihood function does not specify the probability that 𝜃 is the truth, given the observed sample 𝑋=𝑥. Such an interpretation is a common error, with potentially disastrous consequences (see prosecutor's fallacy).
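The difference between the two readings of 𝑓(𝑥∣𝜃) can be seen numerically. The Python sketch below assumes a binomial model with 10 trials (an example chosen here, not taken from the source). With 𝜃 fixed, the values over all 𝑥 sum to 1, as a probability mass function must. With 𝑥 fixed, the values over 𝜃 do not integrate to 1, so the likelihood is not a probability distribution over 𝜃.

```python
import math


def binomial_pmf(x, n, theta):
    """f(x | theta): probability of x successes in n trials."""
    return math.comb(n, x) * theta**x * (1.0 - theta) ** (n - x)


n = 10

# View 1: theta fixed at 0.7, sum over every possible outcome x.
total_over_x = sum(binomial_pmf(x, n, 0.7) for x in range(n + 1))

# View 2: x fixed at 7, integrate over theta in [0, 1]
# using a midpoint Riemann sum.
steps = 10_000
area_over_theta = (
    sum(binomial_pmf(7, n, (i + 0.5) / steps) for i in range(steps)) / steps
)

print(total_over_x)     # close to 1
print(area_over_theta)  # close to 1 / (n + 1), not 1
```

Turning the likelihood into a statement about the probability of 𝜃 requires a prior and Bayes' rule; reading the likelihood itself as that probability is the error described above.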

full text link : https://en.wikipedia.org/wiki/Likelihood_function

E-value

In statistical hypothesis testing, e-values quantify the evidence in the data against a null hypothesis (e.g., "the coin is fair", or, in a medical context, "this new treatment has no effect"). They serve as a more robust alternative to p-values, addressing some shortcomings of the latter.

In contrast to p-values, e-values can deal with optional continuation: e-values of subsequent experiments (e.g. clinical trials concerning the same treatment) may simply be multiplied to provide a new, "product" e-value that represents the evidence in the joint experiment. This works even if, as often happens in practice, the decision to perform later experiments may depend in vague, unknown ways on the data observed in earlier experiments, and it is not known beforehand how many trials will be conducted: the product e-value remains a meaningful quantity, leading to tests with Type-I error control. For this reason, e-values and their sequential extension, the e-process, are the fundamental building blocks for anytime-valid statistical methods (e.g. confidence sequences). Another advantage over p-values is that any weighted average of e-values remains an e-value, even if the individual e-values are arbitrarily dependent. This is one of the reasons why e-values have also turned out to be useful tools in multiple testing.^[1]
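A minimal Python sketch of combining e-values, under assumptions that are not in the source: the null hypothesis is a fair coin, the e-value of each experiment is a likelihood ratio against a hypothetical alternative with heads probability 0.7, the two data sets are invented, and the null is rejected when the e-value reaches 1/α (the standard threshold that follows from Markov's inequality).

```python
def likelihood_ratio_e_value(outcomes, p_alt=0.7, p_null=0.5):
    """E-value for coin flips: likelihood under p_alt over likelihood under p_null."""
    e_value = 1.0
    for is_heads in outcomes:
        numerator = p_alt if is_heads else 1.0 - p_alt
        denominator = p_null if is_heads else 1.0 - p_null
        e_value *= numerator / denominator
    return e_value


# Two hypothetical experiments; the second was run after seeing the first.
trial_1 = [1, 1, 0, 1, 1, 1, 0, 1]
trial_2 = [1, 1, 1, 0, 1, 1]

e1 = likelihood_ratio_e_value(trial_1)
e2 = likelihood_ratio_e_value(trial_2)

# Optional continuation: evidence from successive experiments multiplies.
product_e = e1 * e2

# Any weighted average of e-values is again an e-value.
average_e = 0.5 * e1 + 0.5 * e2

alpha = 0.05
reject_null = product_e >= 1.0 / alpha
print(e1, e2, product_e, average_e, reject_null)
```

The product is the natural choice when one experiment follows another, while the average remains valid even when the individual e-values are arbitrarily dependent, which is why it is useful in multiple testing.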

E-values can be interpreted in a number of different ways: first, the reciprocal of any e-value is itself a p-value, but a special, conservative one, quite different from p-values used in practice. Second, they are broad generalizations of likelihood ratios and are also related to, yet distinct from, Bayes factors. Third, they have an interpretation as bets. Finally, in a sequential context, they can also be interpreted as increments of nonnegative supermartingales. Interest in e-values has exploded since 2019, when the term 'e-value' was coined and a number of breakthrough results were achieved by several research groups. The first overview article appeared in 2023.^[2]

Let the null hypothesis 𝐻_0 be given as a set of distributions for data 𝑌. Usually 𝑌 = (𝑋_1, …, 𝑋_𝜏) with each 𝑋_𝑖 a single outcome and 𝜏 a fixed sample size or some stopping time. We shall refer to such 𝑌, which represent the full sequence of outcomes of a statistical experiment, as a sample or batch of outcomes. But in some cases 𝑌 may also be an unordered bag of outcomes or a single outcome.

An e-variable or e-statistic is a nonnegative random variable 𝐸 = 𝐸(𝑌) such that under all 𝑃 ∈ 𝐻_0, its expected value is bounded by 1:

𝐸_𝑃[𝐸] ≤ 1.

The value taken by e-variable 𝐸 is called the e-value. In practice, the term e-value (a number) is often used when one is really referring to the underlying e-variable (a random variable, that is, a measurable function of the data).
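The defining condition can be verified by brute force for a small example. The Python sketch below assumes a null hypothesis containing a single distribution (a fair coin flipped 10 times) and uses a likelihood ratio against a hypothetical alternative with heads probability 0.7 as the e-variable; these choices and all names are illustrative. It enumerates every possible sample, computes the expected value of the e-variable under the null, and checks the tail bound that Markov's inequality then gives.

```python
import itertools


def likelihood_ratio(outcomes, p_alt, p_null):
    """E-variable: likelihood under p_alt divided by likelihood under p_null."""
    ratio = 1.0
    for is_heads in outcomes:
        ratio *= (p_alt if is_heads else 1.0 - p_alt) / (
            p_null if is_heads else 1.0 - p_null
        )
    return ratio


def null_probability(outcomes, p_null):
    """Probability of this exact sequence under the null."""
    prob = 1.0
    for is_heads in outcomes:
        prob *= p_null if is_heads else 1.0 - p_null
    return prob


n, p_null, p_alt, alpha = 10, 0.5, 0.7, 0.05
samples = list(itertools.product([0, 1], repeat=n))

# Expected value of the e-variable under the null distribution.
expected_e = sum(
    null_probability(y, p_null) * likelihood_ratio(y, p_alt, p_null)
    for y in samples
)

# Null probability that the e-value reaches the threshold 1 / alpha.
tail_prob = sum(
    null_probability(y, p_null)
    for y in samples
    if likelihood_ratio(y, p_alt, p_null) >= 1.0 / alpha
)

print(expected_e)        # equals 1 up to rounding, so the bound holds
print(tail_prob, alpha)  # the tail probability does not exceed alpha
```

Because the expected value is at most 1 under the null, a large observed e-value is unlikely if the null is true, which is what gives the test its Type-I error control.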

full text link : https://en.wikipedia.org/wiki/E-values

2024.05.03