Talk:Baum–Welch algorithm

	This article is within the scope of WikiProject Molecular Biology, a collaborative effort to improve the coverage of Molecular Biology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
	This article has not yet received a rating on the importance scale.
	This article is supported by the Computational Biology task force (assessed as Mid-importance).

Computing Low‑importance

	This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Low	This article has been rated as Low-importance on the project's importance scale.

Statistics Low‑importance

	This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Low	This article has been rated as Low-importance on the importance scale.

Mathematics Low‑priority

	This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Low	This article has been rated as Low-priority on the project's priority scale.

Suggestion

Could Template:HMM example be put to use on this page? (Granted we might have to modify it a bit to make it more general) -- Kowey 14:16, 14 Nov 2004 (UTC)

Content Discussion

There is an article by Welch in the IEEE Information Theory Society Newsletter, December 2003. See http://www.itsoc.org/publications/nltr/it_dec_03final.pdf

I scrutinized the description, and I also think it is wrong.

We have: $\gamma_t(i)$, the probability that an HMM is in state $s_i$ at time $t$ given the observations $O$ and the parameters $\Lambda$. (1)

$\gamma_t(i) = P(q_t = s_i \mid O, \Lambda)$ (2)

$\gamma_t(i) = \frac{\alpha_t(i)\,\beta_t(i)}{P(O \mid \Lambda)}$ (3)

(3) is the forward-backward probability of state $s_i$ at time $t$, that is, the probability of being in state $s_i$ at time $t$ given the whole observation sequence. So this is the role the forward-backward algorithm plays in Baum–Welch.

Now, let's look at the probability of going through a TRANSITION from $s_i$ at time $t$ to $s_j$ at time $t+1$. Let's call it "chi".

$\chi_t(i,j)$ is the probability of being in state $s_i$ at time $t$ and in state $s_j$ at time $t+1$, given the observations and the parameters. (4)

$\chi_t(i,j) = P(q_t = s_i,\ q_{t+1} = s_j \mid O, \Lambda)$ (5)

$\chi_t(i,j) = \frac{\alpha_t(i)\,a_{ij}\,b_j(o_{t+1})\,\beta_{t+1}(j)}{P(O \mid \Lambda)}$ (6)

$\sum_{t=1}^{T-1} \chi_t(i,j)$ is the expected number of transitions from $s_i$ to $s_j$.

$\sum_{t=1}^{T-1} \gamma_t(i)$ is the expected number of transitions from $s_i$.

Updated $a_{ij} = \frac{\sum_{t=1}^{T-1} \chi_t(i,j)}{\sum_{t=1}^{T-1} \gamma_t(i)}$

It is the last step I am not getting. Certainly the numerator is not constant, as the current article claims, nor is it equal to the probability of the entire string. I cannot see why the updated a_ij is greater than the old value.
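For what it is worth, here is a minimal NumPy sketch of equations (3) and (6) and the update above, assuming a discrete HMM and borrowing the two-state example parameters and observation sequence used elsewhere on this page. The function names (`forward_backward`, `update_transitions`) are made up for illustration, and the recursions are unscaled, so this is only suitable for short sequences. As far as I understand it, the EM argument does not say that each a_ij grows; it says that P(O | Lambda) does not decrease after the update, which the last two lines print for comparison.

```python
import numpy as np

def forward_backward(A, B, pi, obs):
    """Unscaled forward/backward passes; returns alpha, beta, P(O | Lambda)."""
    T, N = len(obs), A.shape[0]
    alpha = np.zeros((T, N))
    beta = np.ones((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return alpha, beta, alpha[-1].sum()

def update_transitions(A, B, pi, obs):
    """Return gamma (eq. 3), chi (eq. 6) and the re-estimated transition matrix."""
    alpha, beta, likelihood = forward_backward(A, B, pi, obs)
    gamma = alpha * beta / likelihood                       # shape (T, N)
    # chi[t, i, j] = alpha_t(i) * a_ij * b_j(o_{t+1}) * beta_{t+1}(j) / P(O | Lambda)
    emit_beta = B[:, obs[1:]].T * beta[1:]                  # shape (T-1, N)
    chi = alpha[:-1, :, None] * A[None, :, :] * emit_beta[:, None, :] / likelihood
    new_A = chi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    return gamma, chi, new_A

# Example values taken from this page; symbol 0 = no egg, 1 = egg.
pi = np.array([0.2, 0.8])
A = np.array([[0.5, 0.5], [0.3, 0.7]])
B = np.array([[0.3, 0.7], [0.8, 0.2]])
obs = np.array([0, 0, 0, 0, 0, 1, 1, 0, 0, 0])

gamma, chi, new_A = update_transitions(A, B, pi, obs)
old_lik = forward_backward(A, B, pi, obs)[2]
new_lik = forward_backward(new_A, B, pi, obs)[2]
print(new_A)
print(old_lik, new_lik)
```

Summing chi over j gives back gamma for t = 1..T-1, which is why each row of the updated matrix sums to one.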

References

Cutting, D., Kupiec, J., Pedersen, J., and Sibun, P., "A Practical Part-of-Speech Tagger", Xerox Palo Alto Research Center.

ChengXiang Zhai, A Brief Note On The Hidden Markov Models, Lecture Notes, Dept. of CS, University of Illinois at Urbana-Champaign.

tp://ocw.mit.edu/NR/rdonlyres/Electrical-Engineering-and-Computer-Science/6-345Automatic-Speech-RecognitionSpring2003/4C063D4A-3B8B-4F05-B475-85AAD0D66586/0/lecture10.pdf

Koos van der Wilt 00:52, 13 May 2007 (UTC)

Were you discussing the article or the Welch presentation? Which one do you feel is wrong? I think the latter; please confirm. I've not read your contribution in detail, but I remember several headaches when trying to make sense of this algorithm in Pattern Classification (2ed) by Richard Duda et al. (ISBN 0 471 05669 3); I was convinced the description was wrong, and I remember thinking it was difficult to fix.

--Blaisorblade (talk) 03:09, 24 June 2008 (UTC)

Confusion over Forward-backward algorithm

I think the link is confusing. The current text of Forward-backward algorithm seems to describe a different algorithm, which uses a forward and a backward pass but combines them for other purposes (at least, it appears so to me). The Baum-Welch algorithm also performs such a forward and backward pass, so some sources call it the forward-backward algorithm (for instance Pattern Classification (2ed) by Richard Duda et al. (ISBN 0 471 05669 3)). --Blaisorblade (talk) 03:09, 24 June 2008 (UTC)

Ambiguous indices

In the formula for the epsilon variables, i and j appear both as free indices and as summation indices. 2001:6B0:1:1DF0:C185:CDB5:7AF7:29BA (talk) 17:58, 19 October 2013 (UTC)

Confusing chicken and egg example

There is one chicken and we observe whether or not it lays eggs on a given day. Why, then, are the observations given as NN, NE, EE, EN, that is, why are there two observations each day? I don't get it. HenningThielemann (talk) 17:33, 25 January 2015 (UTC)

   In answer to HenningThielemann: I assume from reading the article that the listed "observations" are supposed to represent observed transitions (pairs of consecutive days). That is, the observations are listed as NN, NN, NN, NN, NE, EE, EN, NN, NN, which is consistent with the actual sequence of events being NNNNNEENNN. This could be clearer; I'll make an edit.  — Preceding unsigned comment added by 49.184.214.178 (talk) 05:24, 9 January 2019 (UTC)
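To make the pairing concrete, here is a small sketch, assuming the daily sequence is NNNNNEENNN (ten days, N = no egg, E = egg), that lists the overlapping pairs of consecutive days. The variable names are illustrative only.

```python
# One symbol per day: N = no egg, E = egg (assumed ten-day sequence).
sequence = "NNNNNEENNN"

# Overlapping pairs (day t, day t+1); a sequence of length T yields T-1 pairs.
pairs = [sequence[t:t + 2] for t in range(len(sequence) - 1)]
print(pairs)
# ['NN', 'NN', 'NN', 'NN', 'NE', 'EE', 'EN', 'NN', 'NN']
```

So each listed "observation" covers two days, and consecutive pairs share a day.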

Working out the probability all day works up an appetite, wherefore the chicken is fried at suppertime. (Probability of laying an egg drops to zero.)

Convergence?

The description ends with "These steps are now repeated iteratively until a desired level of convergence." What steps? Although it would be nice to specify how convergence is measured/detected, I'll leave that aside for now. Isn't the algorithm meant to be evaluated for T=1, then T=2, T=3, and so on until a desired level of convergence is attained? If true, the text should reflect that (it currently does not, since the "steps" are not linked with T). Urhixidur (talk) 15:54, 24 October 2017 (UTC)

Another thing that is sadly missing from the article is the measure of fitness of any given solution. Training an HMM over various data sets (of the same phenomenon) with the same parameters (number of hidden states, basically), how does one pick the "best" one? Urhixidur (talk) 17:34, 24 October 2017 (UTC)
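A possible reading, offered as a sketch rather than as what the article intends: in the usual presentation T is the fixed length of the observation sequence, and the "steps" that repeat are the forward-backward pass followed by the re-estimation of the parameters over the whole sequence. One common stopping rule is to watch the log-likelihood log P(O | Lambda) and stop when its gain falls below a tolerance; the same quantity (ideally on held-out data) is one candidate for comparing fitted models. The code below assumes a discrete HMM, uses per-step scaling, and borrows the example values from this page; `em_step`, `baum_welch`, `tol` and `max_iter` are hypothetical names chosen for illustration.

```python
import numpy as np

def em_step(A, B, pi, obs):
    """One scaled Baum-Welch iteration.

    Returns re-estimated (A, B, pi) and the log-likelihood of the INPUT parameters.
    """
    T, N = len(obs), A.shape[0]
    alpha = np.zeros((T, N))
    beta = np.ones((T, N))
    c = np.zeros(T)                       # per-step scaling factors
    alpha[0] = pi * B[:, obs[0]]
    c[0] = alpha[0].sum()
    alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        c[t] = alpha[t].sum()
        alpha[t] /= c[t]
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1]) / c[t + 1]
    gamma = alpha * beta                  # state posteriors, rows sum to 1
    emit_beta = B[:, obs[1:]].T * beta[1:]
    xi = (alpha[:-1, :, None] * A[None, :, :] * emit_beta[:, None, :]
          / c[1:, None, None])            # transition posteriors
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.stack([gamma[obs == k].sum(axis=0) for k in range(B.shape[1])], axis=1)
    new_B /= gamma.sum(axis=0)[:, None]
    return new_A, new_B, new_pi, np.log(c).sum()

def baum_welch(A, B, pi, obs, tol=1e-6, max_iter=200):
    """Repeat em_step until the log-likelihood gain drops below tol."""
    history = []
    for _ in range(max_iter):
        A, B, pi, loglik = em_step(A, B, pi, obs)
        converged = bool(history) and loglik - history[-1] < tol
        history.append(loglik)
        if converged:
            break
    return A, B, pi, history

# Example values taken from this page; symbol 0 = no egg, 1 = egg.
pi0 = np.array([0.2, 0.8])
A0 = np.array([[0.5, 0.5], [0.3, 0.7]])
B0 = np.array([[0.3, 0.7], [0.8, 0.2]])
obs = np.array([0, 0, 0, 0, 0, 1, 1, 0, 0, 0])

A_hat, B_hat, pi_hat, history = baum_welch(A0, B0, pi0, obs)
print(len(history), history[0], history[-1])
```

With a single short sequence like this one the fit may land in a local optimum, so comparing final log-likelihoods across several starting points seems a reasonable precaution.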

Reproduction

I have written the following code to reproduce the numbers shown in the example.

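# Requires hmmlearn >= 0.2.8 (pip install hmmlearn)
import numpy as np
from hmmlearn import hmm

np.random.seed(42)
# CategoricalHMM models one discrete symbol per time step; init_params=""
# keeps fit() from overwriting the starting parameters set below.
model = hmm.CategoricalHMM(n_components=2, n_iter=100, init_params="")
model.startprob_ = np.array([0.2, 0.8])
model.transmat_ = np.array([[0.5, 0.5],
                            [0.3, 0.7]])
model.emissionprob_ = np.array([[0.3, 0.7],
                                [0.8, 0.2]])
# Observations (0 = no egg, 1 = egg) as a column of shape (n_samples, 1)
X = np.array([0, 0, 0, 0, 0, 1, 1, 0, 0, 0]).reshape(-1, 1)
model.fit(X)
print(model.transmat_)
print(model.emissionprob_)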

Chrism2671 (talk) 16:57, 11 April 2022 (UTC)
