Your flip-flop is not delay-insensitive. Only two-phase clocking is. You need two clocks for your DFF, one for the master and one for the slave, and you must ensure that they are non-overlapping.
In fact, normal DFFs are not delay-insensitive either, and it is a bit of a mystery that they work. Any RTL design built from, for example, rising-edge-triggered flip-flops is, functionally speaking, not guaranteed to work. For example, a shift register made of DFFs can shift incorrectly if there are delays in the clock path: when the clock reaches a stage after the previous stage's output has already changed, that stage captures the new value instead of the old one. The design is delay-sensitive. However, when you use the built-in DFFs, optimize your clock routing, use DCMs and so on, you will typically be lucky and get the design to work.
In general, some people are afraid of latches. It is DFFs they should be afraid of. Latches do not cause trouble with delays as long as you use two-phase non-overlapping clocks for your master and slave latches. A latch design with non-overlapping clocks is delay-insensitive, which means that it works functionally no matter what delay you have in any flip-flop, gate, or route in your design. (It does, however, have a maximum frequency.) A design with DFFs will always depend on delays, since the DFF itself is a delay-sensitive component. If you are unlucky with the delays in your clock routing, it will not work, even though it is functionally correct. IBM and others who test for delay insensitivity have had to approve DFFs manually, since they do not pass the test.
Some have even argued that a design with edge-triggered DFFs can never be safe in a quantum mechanical sense, because its robustness depends on small delays in the design. With a two-phase non-overlapping clock and latches, robustness increases at lower frequencies, since there are no critical timing paths. With DFFs there is a critical timing path inside every DFF, including the ones on the chip.
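To make the shift register argument concrete, here is a small behavioural sketch in Python (not VHDL, and not a model of any real device). Everything in it is an assumption made for illustration: the function names, the integer time ticks, the one-tick latch delay, and the `width` and `gap` parameters. In this toy model the two-phase version keeps shifting correctly as long as the skew between neighbouring latches stays within the non-overlap gap, and you can always widen that gap by slowing the clock.

```python
def shift_dff(bits, n_stages, skew, clk_to_q=1):
    """Edge-triggered shift register; stage i sees the clock edge skew[i] ticks late."""
    q = [0] * n_stages
    history = []
    for bit in bits:
        events = []
        for i in range(n_stages):
            events.append((skew[i], 0, i))             # stage samples its input at its local edge
            events.append((skew[i] + clk_to_q, 1, i))  # its output changes clk_to_q ticks later
        captured = [0] * n_stages
        for _, kind, i in sorted(events):              # process events in time order
            if kind == 0:
                captured[i] = bit if i == 0 else q[i - 1]
            else:
                q[i] = captured[i]
        history.append(tuple(q))
    return history


def shift_two_phase(bits, n_stages, skew, width=2, gap=4):
    """Chain of 2*n_stages latches: even ones on phase 1 (master), odd ones on phase 2 (slave).

    skew[j] delays the clock seen by latch j. Assumes 0 <= skew[j] <= gap.
    """
    period = 2 * (width + gap)
    latch = [0] * (2 * n_stages)

    def enabled(j, t):
        local = t - skew[j]
        if local < 0:
            return False
        start = 0 if j % 2 == 0 else width + gap       # phase 2 starts after width + gap
        return start <= local % period < start + width

    history = []
    for k, bit in enumerate(bits):
        for t in range(k * period, (k + 1) * period):
            prev = list(latch)                         # one tick of delay through each latch
            for j in range(len(latch)):
                if enabled(j, t):
                    latch[j] = bit if j == 0 else prev[j - 1]
        history.append(tuple(latch[1::2]))             # slave outputs are the stage outputs
    return history
```

Shifting `[1, 0, 0, 0]` through three stages, `shift_dff` with `skew=[0, 3, 0]` gives `(1, 1, 0)` after the first clock: the bit has raced through two stages at once. `shift_two_phase` with `skew=[0, 3, 1, 4, 0, 2]` still gives `(1, 0, 0)`, `(0, 1, 0)`, `(0, 0, 1)`, `(0, 0, 0)`.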
No edge-triggered DFF can be designed without delay sensitivity. Using latches avoids this problem. They work. The problem is that people do not understand how to use them. In your design, you are trying to use latches to build a DFF. The problem is that a DFF is not a good idea. DFFs are unstable, and that is what you are seeing. The DFFs on the chip have the same problem, but thanks to careful analog design, and timing trimmed with inverters and other logic, it has to some degree been avoided. You will still have trouble with delays in the clock routing, though, since your design is not insensitive to component delays.
What you should do is use separate clocks for your master and slave latches, and ensure that the two clocks are never active at the same time. One way to do this is to use a normal DFF: in the first clock period give a pulse on clk1, and in the next clock period give a pulse on clk2. Then you have a delay-insensitive design that does not care about delays in any routing. That first DFF should be one of the on-chip DFFs, because it does care about routing. But once you have generated your two clocks, you are safe.
Never connect a block clocked by clk1 directly to another block clocked by the same clock. Anything that comes from a master output should go to a slave input, and anything on a slave output should go to a master input. Every slave should be on the slave clock, and every master on the master clock. And the master and slave clocks must be safely non-overlapping.
Using latches also gives a design with better flow, because it is not split into fixed-size time slots. But your clock will only be half as fast, since you need to divide it by two to get the two clocks. And the first flip-flop will still be edge-triggered and delay-sensitive. To get one operation per clock period, you will need techniques from double-edge triggering, which take up a bit more space.
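The clock splitting described above (a pulse on clk1 in one period, a pulse on clk2 in the next) can be sketched as a waveform model. This is only an illustration in Python with made-up names (`two_phase_clocks`, `non_overlapping`, `select`). It assumes an ideal toggle flip-flop that switches exactly on the rising edge of the base clock. On a real chip that flip-flop has a delay of its own, which is why it should be one of the on-chip DFFs.

```python
def two_phase_clocks(base_clk):
    """Split a sampled base clock into clk1/clk2 pulses on alternating periods."""
    clk1, clk2 = [], []
    select = 1   # ideal toggle flip-flop: 0 routes the pulse to clk1, 1 routes it to clk2
    prev = 0
    for level in base_clk:
        if level == 1 and prev == 0:   # rising edge of the base clock: switch phase
            select ^= 1
        clk1.append(level if select == 0 else 0)
        clk2.append(level if select == 1 else 0)
        prev = level
    return clk1, clk2


def non_overlapping(clk1, clk2):
    """True if the two clocks are never high in the same sample."""
    return not any(a and b for a, b in zip(clk1, clk2))
```

For a base clock sampled as `[1, 1, 0, 0]` repeated three times, clk1 is high in the first and third periods and clk2 in the second, and each of them runs at half the base frequency. The low half of the base clock becomes the gap between the two phases.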
If you have enough delay in your paths, you could generate the master clock as an inverted slave clock. However, your design is then not delay-insensitive, and not safe. In that case it is better to use DFFs, because they have been carefully matched and designed by analog specialists to give fast output with high robustness, even though, from a digital and quantum mechanical point of view, they will never be safe.
You could make your design work by adding some delay in your data path. For example, you could use after clauses to create delays. Or you could use inverters that are not optimized away to create zero-delay stages that still act as a delay when you simulate. However, it will not be safe when you compile it to a chip. It is only safe when you use a two-phase non-overlapping clock. But then it is safe: safer than standard VHDL designs, and robust to component and temperature variations.
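To see why an inverted clock is not the same as a non-overlapping one, you can count the ticks where both latch enables are high once one of the clocks arrives late. The sketch below is a toy Python model. The name `overlap_ticks` and the eight-tick waveforms are made up for the illustration, and the skew is modelled as a cyclic shift of one period.

```python
def overlap_ticks(phase_a, phase_b, skew_b):
    """Count ticks where both enables are high when phase_b arrives skew_b ticks late."""
    n = len(phase_a)
    return sum(1 for t in range(n) if phase_a[t] and phase_b[(t - skew_b) % n])


# One period of each clock, sampled in ticks (illustrative values only).
slave = [1, 1, 1, 1, 0, 0, 0, 0]
master_inverted = [1 - v for v in slave]    # master clock = inverted slave clock

slave_gapped = [0, 1, 1, 0, 0, 0, 0, 0]     # two phases with a two-tick gap on each side
master_gapped = [0, 0, 0, 0, 0, 1, 1, 0]
```

With these waveforms, the inverted pair overlaps for one tick as soon as the master clock is one tick late, and for two ticks at two ticks of skew. The gapped pair has no overlap until the skew exceeds the two-tick gap.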