E8500 = Cold Bug


4Qman
26-01-2008, 10:16 AM
Today I tried the E8500 on the cascade using my cold-friendly P5K-E, which boots at -140°C under LN2.

This chip won't boot anything under -85°C. At -80°C it boots at 5.3GHz fine, but anything colder locks up with no POST. Tried all settings. Oh well..

CPU - Q746A363

http://i19.photobucket.com/albums/b167/4qman/e8500.jpg

K404
26-01-2008, 10:22 AM
Hold up, didn't you guys work around this on the 27th? Boot at -40 or whatever, then let the temp work down in Windows? Yeah, still a hassle, but as long as there's a core mod on the mobo, you can keep cranking the voltage in real time and clock in Windows.

4Qman
26-01-2008, 10:24 AM
Still locks up when reaching -85 to -90°C.

Cold bug / cold boot :ohmy:

Not good.

weescott
26-01-2008, 10:31 AM
I had a similar issue with my E8500 but at higher temps. :(

maverik-sg1
26-01-2008, 10:41 AM
BIOS update on the board? Try an older board?

4Qman
26-01-2008, 10:44 AM
Done this mate.

CPU is the problem.

K404
26-01-2008, 10:52 AM
http://rayer.ic.cz/hardware/core2/core2.gif

Pad M2: ThermTrip

AK 1 and AL 1: Therm

Not tested, no idea if that'll disable the temp diode, please don't blame me if it kills the chip! Will keep hunting later on.

4Qman
26-01-2008, 10:54 AM
thanks Ken..

K404
26-01-2008, 11:17 AM
Thanks for the pic Dave. Here are the pads I've named in the post above (your labelling was spot on Dave, just got your 2nd PM):

http://img244.imageshack.us/img244/1508/45nmxf5gi0.jpg (http://img244.imageshack.us/my.php?image=45nmxf5gi0.jpg)

Baz had something on this as well, but I can't find the thread.

4Qman
26-01-2008, 11:18 AM
Good man Ken.

Doing it now :coffee: Let's hope it helps her out.

K404
26-01-2008, 11:19 AM
Fingers crossed the chip doesn't die!

4Qman
26-01-2008, 11:22 AM
This is correct rather than what I marked in the pic, yes Ken? Only asking as I thought the CPU was at a different angle to the datasheet.

EDIT
Ken, I changed the angle of the second pic I sent. Check your pic mate, relabel the 2nd one I sent and edit the post please.

4Qman
26-01-2008, 12:04 PM
OK Ken, I have done the points marked in red and it's booting as normal.

Trying Cascade now.

K404
26-01-2008, 12:08 PM
Doh, too many pics now! The ones I marked in red are fine. I took the empty pads as markers (one off, 3 on, 2 off) and worked in the same column as them, and the column next to them.

4Qman
26-01-2008, 12:47 PM
No luck Ken.

Still hates the cold. I am going to recheck that the paint is still on the pads.

Stocky
26-01-2008, 01:09 PM
There's impressive stuff going on in this thread! I didn't know you could block the thermal sensor pins like that?! I presume that you'll have to use a good paint and make sure it hardens so that the pins don't break through in the socket.

Dualist
26-01-2008, 01:16 PM
Paint may be conductive; I use Sellotape for the pads on a BSEL mod ;)

Raja
26-01-2008, 01:22 PM
Blocking the pad will make no diff. The CPU kills its internal clock before sending out the kill signal to the PWM..

regards
Raja

K404
26-01-2008, 01:33 PM
:( Thanks for the info Raja. Is there no way to bypass the mechanism at all?

This is assuming it's a "mistake" on the CPU's part, and not a batch-related silicon limit.

Raja
26-01-2008, 02:28 PM
hi,

Unfortunately, no external bypass is possible.. Once again, it's gonna be down to the luck of the draw..

regards
raja

Dualist
26-01-2008, 02:38 PM
Looks like you're gonna have to run it on a single stager then Dave :(

But it should run ok on that new build though mate ;)

Johnny Bravo
26-01-2008, 02:49 PM
So what is the mechanism of a cold bug: is it a thermal limit at which the CPU develops a fault, or a thermal limit that triggers a flag which stops the CPU?

I recall the Bad Axe requiring a Tcore mod, but I think this was more to do with the motherboard shutting down the CPU than the CPU failing?

Raja
26-01-2008, 03:12 PM
Tcmin is the minimum temperature threshold, a safety feature. Upon breach, this flags/triggers Thermtrip, which kills the internal clock and then sends the output signal via M2 to the PWM. This feature is not controllable via an external source/software. To answer your question in short, it's a hardware safety feature.


regards
Raja
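
For anyone trying to picture the order of events Raja describes, here is a rough Python sketch. It is only a toy model of the sequence as stated in the post (Tcmin breach, Thermtrip flagged, internal clock killed, then the signal out via M2), not Intel's actual logic. The class name `CpuModel`, the `m2_pad_masked` flag and the -85°C threshold are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CpuModel:
    """Toy model of the trip sequence from the post; hypothetical names and values."""
    tc_min_c: float                 # assumed minimum-temperature threshold
    m2_pad_masked: bool = False     # True if pad M2 has been painted or taped over
    clock_running: bool = True
    events: list = field(default_factory=list)

    def sample(self, sensed_temp_c: float) -> None:
        # Nothing happens if the chip is already stopped or the reading is above Tcmin.
        if not self.clock_running or sensed_temp_c >= self.tc_min_c:
            return
        # Order as described in the post: flag first, stop the clock, then signal out.
        self.events.append("thermtrip_flagged")
        self.clock_running = False
        self.events.append("internal_clock_stopped")
        if self.m2_pad_masked:
            self.events.append("m2_signal_blocked")
        else:
            self.events.append("m2_signal_to_pwm")


chip = CpuModel(tc_min_c=-85.0, m2_pad_masked=True)
chip.sample(-80.0)   # above the assumed threshold, chip keeps running
chip.sample(-90.0)   # below it, the clock stops before M2 is ever involved
print(chip.clock_running, chip.events)
```

In this model, masking M2 only changes the last event. The clock has already stopped by then, which would be why the pad mod made no difference.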

Johnny Bravo
26-01-2008, 03:33 PM
So in some respects we are actually looking for CPUs with a "faulty" Tcmin... easy as that :S

SoddemFX
26-01-2008, 08:44 PM
Raja,

Is Tcmin an independent register, or is it the result of an overflow of Tcmax, if such a thing exists?

When you say it's luck of the draw, are different CPUs programmed with different Tcmin, or are they all programmed the same with sensor error giving the different real "bug" temperatures?

Tom

Raja
26-01-2008, 09:03 PM
Tcmin is a register independent of Tcmax; naturally, they both trigger the same shut-off sequence. The rest is probably largely dependent on drift within the internal temp diode when supercooled. The shutdown conditions/procedures of Tcmin and Tcmax are outlined in the QX9650 white paper.

regards
Raja
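
To illustrate the "luck of the draw" point, here is a small Python sketch of how the same programmed Tcmin could give different real cold bug temperatures if the diode reading drifts. It assumes drift is a simple fixed offset on the sensed temperature, which is a simplification for illustration only. The function names and the -85°C / 100°C thresholds are hypothetical, not figures from any datasheet.

```python
from typing import Optional


def trip_source(real_temp_c: float, tc_min_c: float, tc_max_c: float,
                drift_c: float = 0.0) -> Optional[str]:
    """Return which threshold trips, based on the sensed (drifted) temperature."""
    sensed_c = real_temp_c + drift_c   # assumption: drift is a plain additive offset
    if sensed_c < tc_min_c:
        return "tc_min"
    if sensed_c > tc_max_c:
        return "tc_max"
    return None   # both thresholds feed the same shut-off sequence; neither tripped


def real_cold_bug_temp(tc_min_c: float, drift_c: float) -> float:
    """Real temperature at which the sensed value reaches Tcmin."""
    return tc_min_c - drift_c


TC_MIN_C, TC_MAX_C = -85.0, 100.0   # hypothetical thresholds
for drift in (-10.0, 0.0, 5.0):
    print(drift, real_cold_bug_temp(TC_MIN_C, drift))
```

Under that assumption, a diode that reads 5°C warm lets the chip get 5°C colder before it trips, and one that reads 10°C cold bugs out 10°C earlier, even with identical Tcmin values.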

I.L.P
26-01-2008, 11:09 PM
Why set a minimum? Surely you can't kill a CPU with too much cold?

Raja
26-01-2008, 11:33 PM
There's so much that could be to blame for the inclusion of a purposeful failsafe. 45nm hafnium has many qualities, but as usual there is no such thing as a free lunch. High gate capacitance is one such issue, requiring high drive current to overcome. It is nearly impossible to fully predict how well the hafnium transistors and current-drive transistors cope with subzero temps and exorbitant levels of Vcore. Intel obviously sensed the need for a failsafe..


regards
Raja

I.L.P
26-01-2008, 11:40 PM
They should just put a warning label on the box about extreme temps and then let us decide how we run them really. It's not as if Joe Average who buys a Dell is gonna slap an LN2 tube on it; anyone who's at those temps usually has money to burn anyway and a long list of dead hardware.

Raja
26-01-2008, 11:46 PM
It may not actually be an outright failure that is being protected against. Ever wondered about the possibility of loss of performance at higher voltages? There's more to it than assuming that this is all to do with device failure. People still have not fully grasped the changes that this new process brings..



regards
Raja

Dualist
27-01-2008, 12:12 AM
Any more than the standard cooler and volts and your warranty is invalid.
They aren't gonna say 'if you run below -85°C this chip will malfunction', are they? They aren't gonna test every CPU at every single degree below zero; if they did we'd be looking at CPUs at £1000 apiece..!! :huh:
God knows what a chip guaranteed to run at -190°C would cost.