AMD Processors
Topic Title: Change the architecture K10
Topic Summary:
Created On: 08/18/2009 12:39 AM
Status: Read Only
Answer This question was answered by go_for, on Sunday, January 20, 2013 11:57 PM

Answer:
Opteron 2, have you signed in to the AMD dev forums yet?

http://forums.amd.com/devforum/
 08/18/2009 12:39 AM
OPTERON 2???
Lurker

Posts: 17
Joined: 07/13/2009



1. Remove the L2 cache.
2. Create 4 analysis blocks.
3. Create a common calculation unit.
4. Place a cache between the two blocks to speed up the work.

The analysis units must share data on the processing status.

What do the AMD representatives say?
 08/26/2009 05:07 PM
HaricotVert
B1FF

Posts: 106
Joined: 05/28/2009

Originally posted by: OPTERON 2???

What do the AMD representatives say?


I'm pretty sure they'd say that they don't take suggestions on public forums...

Might want to try http://forums.amd.com/devforum/ for a more receptive audience than the end-user troubleshooting/info forums.

Edited: 08/27/2009 at 12:30 AM by HaricotVert
 08/29/2009 06:01 AM
OPTERON 2???
Lurker

Posts: 17
Joined: 07/13/2009



I had roughly this diagram in mind. This is the future ???. The PHENOM architecture is a dead end.
First step: they increased the number of cores and optimised cache operation.
Second step: Shanghai.
Third step: a redesign of the cores is required, along with optimisation of the threads.

I have been thinking that the design should start from a spider's web. Threads should be able to travel along the shortest paths.
I hope they are working on it.
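
The "web" idea can be illustrated with a small hop-count comparison. This is only a sketch under stated assumptions: the post gives no actual topology, so the plain ring, the ring with a central hub and spokes, and the names `ring`, `web`, and `hops` are all hypothetical.

```python
from collections import deque

def ring(n):
    """Hypothetical plain ring: each unit talks only to its two neighbours."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}

def web(n):
    """Hypothetical web: the same ring plus a central hub with a spoke to every unit."""
    graph = ring(n)
    graph["hub"] = set(range(n))
    for i in range(n):
        graph[i].add("hub")
    return graph

def hops(graph, start, goal):
    """Breadth-first search: number of links on a shortest path from start to goal."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return dist[node]
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return None  # goal not reachable

ring_hops = hops(ring(8), 0, 4)  # opposite side of an 8-unit ring: 4 hops
web_hops = hops(web(8), 0, 4)    # same pair through the hub: 2 hops
```

With the assumed hub, the worst-case path between two units drops from 4 hops to 2, which is the kind of shortcut the post seems to be asking for.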
 10/10/2009 06:31 AM
OPTERON 2???
Lurker

Posts: 17
Joined: 07/13/2009

I recently realised that what I drew is already used in Nehalem. Look at nvidia: they are ready to change everything within 2 months.
If all the cards are on the table for AMD's engineers, what is left to do? Why can't you build what has already been drawn? Or do everything much faster? The engineers are slow, so the processors are slow!

It is ingenious, and it should come out!

This is what I had in mind! Look at the two diagrams above. There is the wheel; nobody has thought up anything better than the wheel in 18 centuries.

The hardest part of this work is creating the shortest paths along which the data should move. In maths you would get a 2 for PHENOM. It is irrational and should be dismissed! Your profit will grow from rationality.

What if the processor were built on the basis of a web? Why does a spider build itself a web?

Your data is like a wheel rolling along a road. It is a circle! Recurrence and repeatability!


Edited: 10/10/2009 at 06:56 AM by OPTERON 2???
 05/21/2010 11:45 AM
OPTERON 2???
Lurker

Posts: 17
Joined: 07/13/2009



I would suggest modifying the Bulldozer design.
Call the system Cyclone.
Data is processed quickly by a central controller.

A processing ring

There should be two IMUs. Processed data may be placed in the "DATA CACHE". The process must go through certain filtering stages. Preparation should start before the data is processed. Threads should be tagged. The central controller must monitor the data in the "DATA CACHE".

Part of the 8 data streams should be filtered.

After decoding, the data should be grouped: do not send it directly to processing, but prepare groups for processing by 8 threads.
The central controller must track that the groups are filled. It can also disable cores that are not in use.
The central controller and the "DATA CACHE" should be merged, the way the human brain works.
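
The grouping step can be sketched in a few lines. This is a minimal illustration, not anything from an actual AMD design: the function name `group_decoded` and the rule that only full groups are dispatched are assumptions read into the post.

```python
def group_decoded(ops, group_size=8):
    """Collect decoded operations into groups of `group_size`.

    Returns (full_groups, pending): full groups are ready to dispatch to the
    8 threads; `pending` is the partly filled group the controller still tracks.
    """
    full_groups = []
    pending = []
    for op in ops:
        pending.append(op)
        if len(pending) == group_size:
            full_groups.append(pending)
            pending = []
    return full_groups, pending

# 19 decoded operations give two full groups and 3 operations still waiting.
full, waiting = group_decoded(list(range(19)))
```

A controller built this way could use the size of `waiting` to decide whether to keep a core powered or to disable it, as the post suggests.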

Edited: 05/21/2010 at 10:26 PM by OPTERON 2???
 08/30/2010 02:19 PM
OPTERON 2???
Lurker

Posts: 17
Joined: 07/13/2009

I propose combining all the pipelines and making them multistage, so that the data does not move back into the cache.

A multifunctional pipeline is needed. We must work on the variability of the cores.
 08/30/2010 02:34 PM
OPTERON 2???
Lurker

Posts: 17
Joined: 07/13/2009

http://www.mkgt.ru/files/mater...atic/138/glava_35.htm
Pipelined and superscalar processing
Instruction-level parallelism, pipeline scheduling, and the loop unrolling technique
 08/30/2010 02:37 PM
OPTERON 2???
Lurker

Posts: 17
Joined: 07/13/2009

Basics of pipeline scheduling and loop unrolling

To keep the pipeline fully loaded, instruction-level parallelism must be exploited by finding sequences of unrelated instructions that can be overlapped in the pipeline. To avoid a pipeline stall, a dependent instruction must be separated from its source instruction by a distance in clock cycles equal to the pipeline latency of the source instruction. The compiler's ability to perform such scheduling depends on the amount of instruction-level parallelism available in the program and on the latencies of the functional units in the pipeline. In this chapter we assume the latencies shown in Figure 5.24, unless other latencies are stated explicitly. We assume that conditional branches have a delay of one clock cycle, so that the instruction following a branch cannot be determined for one cycle after the branch. We assume that the functional units are fully pipelined or replicated (as many times as the pipeline depth), so that an operation of any type can be issued in every cycle and there are no structural hazards.
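
The separation rule in the quoted passage can be checked with a toy model. This is a sketch assuming an in-order, single-issue pipeline; Figure 5.24 is not reproduced in the thread, so the latency values, the register names, and the helpers `issue_cycles` and `stall_count` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Instr:
    name: str
    dest: Optional[str]     # register written, or None
    srcs: Tuple[str, ...]   # registers read
    latency: int            # cycles that must separate this instruction from a dependent one

def issue_cycles(program):
    """Issue cycle of each instruction on an in-order, single-issue pipeline."""
    ready = {}        # register -> first cycle at which a consumer may issue
    cycles = []
    next_cycle = 0
    for ins in program:
        earliest = max([next_cycle] + [ready.get(r, 0) for r in ins.srcs])
        cycles.append(earliest)
        if ins.dest is not None:
            ready[ins.dest] = earliest + 1 + ins.latency
        next_cycle = earliest + 1
    return cycles

def stall_count(program):
    """Total cycles minus instruction count, i.e. cycles lost to stalls."""
    return issue_cycles(program)[-1] + 1 - len(program)

# Assumed latencies: a load needs 1 cycle of separation, an add needs 0.
load_a = Instr("load a", "a", (), 1)
add_a = Instr("add x, a", "x", ("a",), 0)
load_b = Instr("load b", "b", (), 1)
add_b = Instr("add y, b", "y", ("b",), 0)

naive = [load_a, add_a, load_b, add_b]        # each add directly follows its load
scheduled = [load_a, load_b, add_a, add_b]    # independent load fills the gap
```

Under these assumed latencies, `stall_count(naive)` is 2 and `stall_count(scheduled)` is 0: moving the unrelated load between each producer and its consumer provides exactly the separation the passage describes.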
 08/30/2010 03:01 PM
OPTERON 2???
Lurker

Posts: 17
Joined: 07/13/2009


The shift goes from the edge to the centre. On a miss, the pipeline is immediately filled with other data while the search is under way. That data is held in reserve. You do not move the data; you use RISC registers instead.
The process need not stop; it simply retrieves the data from memory.
Suppose that each ring is a processing stage. To increase speed, you can shift a ring and insert data. We can also keep filling the pipeline with information without stopping. To increase speed, you can move around the ring. And if an instruction is repeated, it can simply be duplicated.

A shift should be performed.

Edited: 08/30/2010 at 03:14 PM by OPTERON 2???
 08/31/2010 09:40 AM
OPTERON 2???
Lurker

Posts: 17
Joined: 07/13/2009


PORT? INTEL may already have ring processing: what it calls a PORT takes the instruction, much as I described above.
In INTEL processors, processing proceeds by shifting between ports. There is a register table to avoid losing processing cycles.
A port, for example, is part of the pipeline, and the processor switches tasks already at the scheduling stage.

Edited: 08/31/2010 at 09:52 AM by OPTERON 2???
 12/25/2010 01:38 AM
OPTERON 2???
Lurker

Posts: 17
Joined: 07/13/2009

4 operations at once

http://rubik-effects.com/view_post.php?id=60
http://www.rubiks.com/
The point of the Rubik's Cube is that you perform 4 operations simultaneously. If there is a diagram of the moves, you are simply building a model of how the processor operates.
When we make one move, the 3 other sides must move too. During that time they must move towards the additional operations. By the time one part clicks into place, the three other parts should already be tagged.
The essence of a Rubik's Cube 4-core processor is that any operation can be shifted to the right calculation sector. Intel uses roughly this model for its ports.

We begin the computation without waiting and accumulate completed tasks as we go; the processor must analyse which tasks have already been operated on. On the basis of the operations already completed, we form a new instruction queue. Some instructions, for example, can be transferred to other modules. For example, flush instructions to "L3", regroup the executed ones, and send them to a free core. The number of cycles needed to perform an operation must be taken into account; a task may be spread across multiple cores.

Edited: 12/25/2010 at 02:09 AM by OPTERON 2???
 04/24/2011 04:28 AM
OPTERON 2???
Lurker

Posts: 17
Joined: 07/13/2009

!!!
For some reason I have assumed that ????? uses direct memory access, bypassing the caches ("Register alias table"). Let us assume that decoding proceeds without stopping. Data that does not arrive in time for processing is pushed out to RAM, and then arrives for processing selectively, already prepared. It turns out that the processing never stops. RAM operation is independent.

In the "retired register file" the data does not accumulate; it leaves. This data can be used by other cores.

1. Tag the data in the table.
2. Assemble the data.
3. Eject it from the core.
4. Give other cores the task of searching for the tags.
5. Collect the tags in 1 core for assembly.

Fast memory access is necessary for this.

Edited: 04/24/2011 at 04:46 AM by OPTERON 2???
 10/16/2011 04:02 AM
OPTERON 2???
Lurker

Posts: 17
Joined: 07/13/2009


Diagram of how 3 cores work together
This interaction scheme can be applied to 6 cores at once.

Edited: 10/16/2011 at 04:14 AM by OPTERON 2???
 10/16/2011 04:48 AM
unclefester1
Overclocker

Posts: 629
Joined: 11/29/2008

The communication interface seems simple enough, 1-2-3, A-B-C (K-D-I). Now all you have to do is change the memory scheduling (Address Strobe Row) to speed up the interaction of the cycles during Column/Row access.

-------------------------
Antec 1650B
PC Power & Cooling Silencer 910w
ASUS M3A79-T DeLuxe
1090T x6 @Boiling in the La-Boratory
Corsair H70
OCZ Reaper 8500 2x2 *Pending
ASUS 5970 (under OC investigation)
EVGA 260 55nm (holding pattern)
SeaGate 320x2 16GB Sata
Creative X-Fi Elite Pro
Logitech Z-5500
XP Pro SP2
 11/27/2011 07:10 AM
OPTERON 2???
Lurker

Posts: 17
Joined: 07/13/2009

 11/30/2011 02:58 PM
OPTERON 2???
Lurker

Posts: 17
Joined: 07/13/2009

The main problem is the weak Bulldozer core. Here is how I think this problem can be solved. The core should be built on the transformer principle. For example, a task arrives at a module; the module, analysing the load on all modules as well as data obtained from the operating system, must decide which mode to switch to. For single-threaded applications we get one large core; for multi-threaded ones we get many small cores. INTEL has been working on core-combining technology since 2003. Bulldozer is, in effect, an INTEL CORE QUAD. Intel has worked further, and all its cores share common RESOURCES, as in the diagram drawn above. The first step in increasing Bulldozer's performance is to fix the module's operating mode.

When the modules are loaded at 4 * 100%, switch the modules to the 8 * 100% mode. Otherwise, the application must clearly understand that it can use 8 cores.

First, bear in mind that applications are not aware that we have 8 cores, so the application should first see modules; if the application is able to load the modules to 100%, we can then switch it to 8 cores. We just need to work on switching the cores from 4 to 8 and from 8 to 4: 4 * 200% = 8 * 100%. Ideally, if you do it the way Intel does, you get 1 * 800% in 1st gear, 4 * 200% in 2nd gear, and 8 * 100% in 3rd gear.
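
The three "gears" can be written down as a small lookup. This is a sketch only: the percentages are the notional shares from the post (each mode totals 800%), not measured throughput, and the selection rule and the name `choose_mode` are assumptions.

```python
# (execution units visible to software, notional share of resources per unit in %)
MODES = [(1, 800), (4, 200), (8, 100)]

def choose_mode(runnable_threads):
    """Pick the widest-per-unit mode that still gives every runnable thread a unit."""
    for units, share in MODES:
        if runnable_threads <= units:
            return units, share
    return MODES[-1]  # more threads than units: stay in the 8 * 100% mode

single = choose_mode(1)   # (1, 800): one thread gets all the resources
few = choose_mode(3)      # (4, 200)
many = choose_mode(6)     # (8, 100)
```

Every mode multiplies out to the same 800% total, which is the 4 * 200% = 8 * 100% identity stated in the post.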

Edited: 11/30/2011 at 11:02 PM by OPTERON 2???
 12/14/2011 05:28 AM
OPTERON 2???
Lurker

Posts: 17
Joined: 07/13/2009

In developing the next generation, give up on separate cores and work on common RESOURCES. The main goal is that an application which sees 1 core must be able to fully load all the RESOURCES; likewise for two threads, and so on. A shared output unit for the module must analyse the results obtained and, where possible, try to use all the ALUs and FPUs by switching between different modes of operation. If AMD continues to work on cores, it will not achieve anything. Start with two cores, then three, four, and so on. Intel has long been working on a system with shared resources. Tests show that nothing is achieved through the number of cores alone. Part of what exists right now can stay in place, so that the frequency remains at its current level. But all parts of the processor need to be able to communicate. You may spend more time on development, but you will overtake Intel. Start small, with a 4-core processor.
 12/14/2011 08:37 AM
go_for
Alpha Geek

Posts: 3217
Joined: 01/21/2006

Opteron 2, have you signed in to the AMD dev forums yet?

http://forums.amd.com/devforum/
